METHOD FOR EVALUATION OF POLYMORPHIC GENETIC SEQUENCES,
AND THE USE THEREOF IN IDENTIFICATION OF HLA TYPES

DESCRIPTION 

BACKGROUND OF THE INVENTION 

Genetic testing to determine the presence of or a susceptibility to a disease condition
offers incredible opportunities for improved medical care, and the potential for such testing
increases almost daily as ever increasing numbers of disease-associated genes and/or
mutations are identified. A major hurdle which must be overcome to realize this potential,
however, is the high cost of testing. This is particularly true in the case of highly
polymorphic genes where the need to test for a large number of variations may make the
test procedure appear to be so expensive that routine testing can never be achieved.

Testing for changes in DNA sequence can proceed via complete sequencing of a target 
nucleic acid molecule, although many persons in the art believe that such testing is too
expensive to ever be routine. Changes in DNA sequence can also be detected by a
technique called "single-stranded conformational polymorphism" ("SSCP") described by
Orita et al., Genomics 5: 874-879 (1989), or by a modification thereof referred to as
dideoxy-fingerprinting ("ddF") described by Sarkar et al., Genomics 13: 441-443 (1992).
SSCP and ddF both evaluate the pattern of bands created when DNA fragments are 
electrophoretically separated on a non-denaturing electrophoresis gel. This pattern depends 
on a combination of the size of the fragments and of the three-dimensional conformation of 
the undenatured fragments. Thus, the pattern cannot be used for sequencing, because the 
theoretical spacing of the fragment bands is not equal. 

The hierarchical assay methodology described in US Patent No. 5,545,527 and
International Patent Publication No. WO 96/07761, which are incorporated herein by
reference, provides a mechanism for systematically reducing the cost per test by utilizing a
series of different test methodologies which may have significant numbers of results
incorrectly indicating the absence of a genetic sequence of interest, but which rarely if ever
yield a result incorrectly indicating the presence of such a genetic sequence. The tests
employed in the hierarchy may frequently be combinations of different types of molecular
tests, for example combinations of immunoassays, oligonucleotide probe hybridization
tests, oligonucleotide fragment analyses, and direct nucleic acid sequencing. This
application relates to a particular type of test which can be useful alone or as part of a 
hierarchical testing protocol, particularly for highly polymorphic genes. A particular 
example of the use of this test is its application to determining the allelic type of human 
HLA genes, although the test is applicable to many genes of known sequence, and the 
invention should not be construed as limited to HLA. 

Human HLA genes are part of the major histocompatibility complex (MHC), a cluster
of genes associated with tissue antigens and immune responses. Within the MHC genes are
two groups of genes which are of substantial importance in the success of tissue and organ
transplants between individuals. The HLA Class I genes encode transplantation antigens
which are used by cytotoxic T cells to distinguish self from non-self. The HLA Class II
genes, or immune response genes, determine whether an individual can mount a strong 
response to a particular antigen. Both classes of HLA genes are highly polymorphic, and in 
fact this polymorphism plays a critical role in the immune response potential of a host. On 
the other hand, this polymorphism also places an immunological burden on the host 
transplanted with allogeneic tissues. As a result, careful testing and matching of HLA types 
between tissue donor and recipient is a major factor in the success of allogeneic tissue and 
marrow transplants. 

Typing of HLA genes has proceeded along two basic lines: serological and nucleic 
acid-based. In the case of serological typing, antibodies have been developed which are 
specific for certain types of HLA proteins. Panels of these tests can be performed to 
evaluate the type of a donor or recipient tissue. In nucleic acid-based approaches, samples
of the HLA genes may be hybridized with sequence-specific oligonucleotide probes to
identify particular alleles or allele groups. In some cases, determination of HLA type by
sequencing of the HLA gene has also been proposed. Santamaria P, et al., "HLA Class I
Sequence-Based Typing", Human Immunology 37: 39-50 (1993).

In all of these cases, the test panel performed on each individual sample is extensive, 
with the result that the cost of HLA typing is very high. It would therefore be desirable to 
have a method for typing HLA which provided comparable or better reliability at 
substantially reduced cost. It is an object of the present invention to provide such a 
method.

SUMMARY OF THE INVENTION

The method of the invention makes use of a modification of standard sequencing 
technology, preferably in combination with improved data analysis capabilities to provide a 
streamlined method for obtaining information about the allelic type of a sample of genetic 
material. Thus, in accordance with the invention, the allelic type of a polymorphic genetic 
locus in a sample is identified by first combining the sample with a sequencing reaction 
mixture containing a template-dependent nucleic acid polymerase, A, T, G and C 
nucleotide feedstocks, one type of chain terminating nucleotide and a sequencing primer 
under conditions suitable for template-dependent primer extension to form a plurality of
oligonucleotide fragments of differing lengths, and then evaluating the length of the 
oligonucleotide fragments. As in a standard sequencing procedure, the lengths of the 
fragments can be evaluated on a denaturing gel, such that the actual length of each 
fragment, independent of conformational changes that may be caused by sequence 
variations is determined. The observed bands therefore indicate the positions of the type of 
base corresponding to the chain terminating nucleotide in the extended primer. The method 
of the invention differs from standard sequencing procedures, however, because instead of 
performing and evaluating four concurrent reactions, one for each type of chain terminating 
nucleotide, in the method of the invention the sample is concurrently combined with at 
most three sequencing reaction mixtures containing different types of chain terminating 
nucleotides. Preferably, the sample will be combined with only one reaction mixture, 
containing only one type of chain terminating nucleotide and the information obtained from 
this test will be evaluated prior to performing any additional tests on the sample.

In many cases, evaluation of the positions of only a single base will allow for allelic 
typing of the sample. In this case, no further tests need to be performed. Thus, the use of 
the method of the invention can increase laboratory throughput (since up to four times as 
many samples can be processed on the same amount of equipment) and reduce the cost per 
test by up to a factor of four compared to sequencing of all four bases for every sample.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows the application of the invention to typing of a simple polymorphic gene;

Fig. 2 illustrates an improved method for distinguishing heterozygotic alleles using the
present invention;

Fig. 3 illustrates a situation in which heterozygote pairs remain ambiguous even after
full sequencing;

Fig. 4 illustrates the use of a control lane to evaluate the number of intervening bases in
a single base sequencing reaction;

Fig. 5 shows results from an automated DNA sequencing apparatus;

Fig. 6 illustrates peak-by-peak correlation of sequencing results;

Fig. 7 shows a plot of the maxima of each data peak plotted against the separation
from the nearest other peak; and

Figs. 8A-8C illustrate the application of the invention to typing of Chlamydia
trachomatis.

DETAILED DESCRIPTION OF THE INVENTION

While the terminology used in this application is standard within the art, the following
definitions of certain terms are provided to assure clarity.

1. "Allele" refers to a specific version of a nucleotide sequence at a polymorphic genetic
locus. 

2. "Polymorphism" means the variability found within a population at a genetic locus. 

3. "Polymorphic site" means a given nucleotide location in a genetic locus which is 
variable within a population. 

4. "Gene" or "Genetic locus" means a specific nucleotide sequence within a given 
genome. 

5. The "location" or "position" of a nucleotide in a genetic locus means the number 
assigned to the nucleotide in the gene, generally taken from the cDNA sequence or the
genomic sequence of the gene. 

6. The nucleotides Adenine, Cytosine, Guanine and Thymine are sometimes represented
by their designations of A, C, G or T, respectively.

While it has long been apparent to persons skilled in the art that knowledge of the
identity of the base at a particular location within a polymorphic genetic locus may be 
sufficient to determine the allelic type of that locus, this knowledge has not led to any 
modification of sequencing procedures. Rather, the knowledge has driven development of
techniques such as allele-specific hybridization assays, and allele-specific ligation assays. 
Despite the failure of the art to recognize the possibility, however, it is not always 
necessary to determine the sequence of all four nucleotides of a polymorphic genetic locus 
in order to determine which allele is present in a specific patient sample. Certain alleles of a 
genetic locus may be distinguishable on the basis of identification of the location of less
than four, and often only one nucleotide. This finding allows the development of the 
present method for improved allele identification at a polymorphic genetic locus. 

A simple example is to consider a polymorphic site for which only two alleles are 
known, as in Figure 1. In this case, identification of the location of the A nucleotides in the
genetic locus, particularly at site 101, will distinguish whether allele 1 or allele 2 is present.
If a third allele were discovered which had a C at site 101, the presence of the allele could be
distinguished either by the absence at site 101 of an A and a T in independent A and T
reactions or by the presence of a C at site 101.

Traditionally, if sequencing were going to be used to evaluate the allelic type of the 
polymorphic site of Fig. 1, four dideoxy nucleotide "sequencing" reactions of the type 
described by Sanger et al. (Proc. Natl. Acad. Sci. USA 74: 5463-5467 (1977)) would be 
run on the sample concurrently, and the products of the four reactions would then be 
analyzed by polyacrylamide gel electrophoresis (see Chp 7.6, Current Protocols in
Molecular Biology, Eds. Ausubel, F.M. et al., (John Wiley & Sons; 1995)). In this well
known technique, each of the four sequencing reactions generates a plurality of primer
extension products, all of which end with a specific type of dideoxy-nucleotide. Each lane
on the electrophoresis gel thus reflects the positions of one type of base in the extension
product, but does not reveal the order and type of nucleotides intervening between the
bases of this specific type. The information provided by the four lanes is therefore
combined in known sequencing procedures to arrive at a composite picture of the sequence
as a whole.

In accordance with the present invention, however, single sequencing reactions are
performed and evaluated independently to provide the number of intervening bases between 
each instance of a selected base and thus a precise indication of the positional location of 
the selected base. Applying the method of the invention to the simplistic example of Fig. 1,
a single sequencing reaction would first be performed using either dideoxy-A or dideoxy-T
as the chain terminating nucleotide. If the third allelic type did not exist or was unknown,
this single test would be enough to provide a specific result. If the third allelic type was
known to exist and the base present in the sample was not identified by the first test, a
second sequencing test could be performed using either dideoxy-C or the dideoxy-A/T not
used in the first test to resolve the identity of the allelic type. Alternatively, some other test
such as an allele-specific hybridization probe or an antibody test which distinguished well
between allele 1 or 2 and allele 3 could be used in this case.

As is clear from this example, the method of the invention specifically identifies
"known" alleles of a polymorphic locus and is not necessarily useful for identification of
new and hitherto unrecorded alleles. An unknown allele might be missed if it were
incorrectly assumed that the single nucleotide sequence obtained from a patient sample
corresponded to a unique allele, when in fact other nucleotides of the allele had been
rearranged in a new fashion. The method is specific for distinguishing among known alleles
of a polymorphic locus (though it may fortuitously come across new mutations if the right 
single nucleotide sequence is chosen). Databases listing known alleles must therefore be 
continually updated to provide greatest utility for the invention. 

The advantages of "less than 4" nucleotide analysis of the invention for identifying 
alleles are the decrease in costs for reagents and labor and the increased throughput of 
patient samples that can be obtained in a diagnostic laboratory. These advantages can be 
more dramatically demonstrated by considering a system which more closely approximates 
a real world example. For this purpose, we have assumed a population in which only the
known HLA Class II DR4 alleles exist (of these, 5 alleles, DRB1*0401, DRB1*0402,
DRB1*0405, DRB1*0408, and DRB1*0409, are found in 95% of the North American
population), and in which these alleles are always homozygous.

To determine the order in which the single nucleotide sequences should be performed,
the sequence differences among alleles are evaluated to determine which of the bases will
yield the most information, and the circumstances in which knowledge about two or more
bases yields a definitive typing. To do this, we look first at base A, for example, to
determine which alleles can be identified unequivocally from a knowledge of the position of
the A bases within the sample. One way to approach this is to set up a table which shows
the base for each allele at each polymorphic site, as shown in Table 1, and to determine the
pattern which would be observed if the A's in the table were detected. Each unique pattern
can be definitively typed using this one sequencing reaction. For the DR4 alleles, every
allele (including all of the most widely distributed alleles) except DRB1*0413 and
DRB1*0416 produces a unique pattern. All of the other bases effectively identify fewer
allelic types, and therefore the A reaction is done first. Further, it is very likely that any
given group of samples could be entirely typed using this single sequencing reaction. In the
event that samples were not definitively typed using this first sequencing reaction, any
second sequencing reaction performed on the untyped samples would distinguish between
DRB1*0413 and DRB1*0416.
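
By way of illustration only, the selection of the first base to be tested can be sketched in Python as follows. The allele names and the four-site table in `ALLELES` are hypothetical toy values, not the contents of Table 1, and the function names `pattern`, `uniquely_typed` and `rank_bases` are introduced here purely for the sketch.

```python
from collections import Counter

# Hypothetical toy table: base at each polymorphic site for each allele.
# These values are illustrative only and are NOT taken from Table 1.
ALLELES = {
    "allele_1": "AGTA",
    "allele_2": "GGTA",
    "allele_3": "ACAA",
    "allele_4": "ACTC",
}


def pattern(seq, base):
    """Site indexes (0-based) at which `base` occurs in one allele."""
    return tuple(i for i, b in enumerate(seq) if b == base)


def uniquely_typed(alleles, base):
    """Alleles whose pattern for `base` is shared with no other allele."""
    counts = Counter(pattern(seq, base) for seq in alleles.values())
    return sorted(name for name, seq in alleles.items()
                  if counts[pattern(seq, base)] == 1)


def rank_bases(alleles):
    """Bases ordered from most to fewest uniquely typed alleles."""
    return sorted("ACGT", key=lambda b: -len(uniquely_typed(alleles, b)))


first_base = rank_bases(ALLELES)[0]
print(first_base, uniquely_typed(ALLELES, first_base))
```

With this toy table the A reaction yields a unique pattern for every allele and would therefore be run first; with a real allele table the ranking would of course depend on the actual sequences.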

The significance in terms of cost per test of using the method of the invention is
easily appreciated. Determining the DR4 allelic type of 100 samples using traditional 4
nucleotide DNA sequencing requires performance of a total of 400 sequencing reactions.
Assuming a cost (reagents plus labor) of $20.00 per test, this would result in a cost per
patient of $80.00. In contrast, in the test using the method of the invention, only the first
test for the positions of A is performed on all samples. Even assuming the statistically
unlikely event that 5% of the samples are of type DRB1*0413 or DRB1*0416, 95 positive
typings will result. The remaining 5 samples are tested using a second (G, C or T)
sequencing reaction, with the result that all 5 samples are definitively typed. Thus, the cost
for performing these 100 typings using the method of the invention is $2,100 or $21 per
patient.
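
The arithmetic of this comparison can be checked with a short Python sketch. The function names are introduced here for illustration; the inputs (100 samples, $20.00 per reaction, 5 samples left untyped after the first reaction) are the figures assumed in the paragraph above.

```python
def full_sequencing_cost(n_samples, cost_per_reaction, reactions_per_sample=4):
    """Cost when every sample receives all four sequencing reactions."""
    return n_samples * reactions_per_sample * cost_per_reaction


def staged_cost(n_samples, cost_per_reaction, untyped_after_first):
    """Cost when one reaction is run on every sample and a second
    reaction is run only on the samples the first could not type."""
    return (n_samples + untyped_after_first) * cost_per_reaction


full = full_sequencing_cost(100, 20.00)
staged = staged_cost(100, 20.00, untyped_after_first=5)
print(full, full / 100)      # total and per-patient cost, four reactions each
print(staged, staged / 100)  # total and per-patient cost, staged testing
```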
In some cases, the second sequencing reaction performed may not yield unique
patterns for all of the samples tested. In this case, prior to performing a third sequencing
reaction, it is desirable to combine the results of the first two sequencing reactions and
evaluate these composite results for unique base patterns. Thus, for example, a first and
second sequencing reaction may have four alleles which can be characterized as follows:

            A pattern    T pattern

Allele 1    1 3 2        2 2 11

Allele 2    1 3 2        2 4 11

Allele 3    3 4 2        2 2 11

Allele 4    3 4 2        2 3 11

Allele 2 and Allele 4 give unique results from the T-sequence reaction alone, and can
therefore be typed based upon this information. Alleles 1 and 3, however, have the same
T-sequencing pattern. Because these two alleles have different A-sequencing reaction patterns,
however, they are clearly distinguishable and can be typed based upon the combined
patterns without further testing.
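
A minimal Python sketch of this composite evaluation follows. The pattern values in `PATTERNS` are our reading of the four-allele example above and should be treated as illustrative; the function name `consistent_alleles` is hypothetical.

```python
# Patterns as read from the four-allele example; illustrative values only.
PATTERNS = {
    "Allele 1": {"A": (1, 3, 2), "T": (2, 2, 11)},
    "Allele 2": {"A": (1, 3, 2), "T": (2, 4, 11)},
    "Allele 3": {"A": (3, 4, 2), "T": (2, 2, 11)},
    "Allele 4": {"A": (3, 4, 2), "T": (2, 3, 11)},
}


def consistent_alleles(observed, patterns=PATTERNS):
    """Alleles that agree with every reaction observed so far.

    `observed` maps a base ("A", "T", ...) to the pattern seen for it.
    """
    return sorted(name for name, known in patterns.items()
                  if all(known[base] == pat for base, pat in observed.items()))


# T reaction alone: unique for Allele 2, ambiguous between Alleles 1 and 3.
print(consistent_alleles({"T": (2, 4, 11)}))
print(consistent_alleles({"T": (2, 2, 11)}))
# Combining the A result with the T result resolves the ambiguity.
print(consistent_alleles({"T": (2, 2, 11), "A": (3, 4, 2)}))
```

A sample is definitively typed as soon as the returned list contains exactly one allele, so no third reaction is needed in this example.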

This substantial reduction in the number of sequencing reactions means that the cost of 
reagents and labor required to perform the reactions is reduced. Further, since each sample 
must be analyzed by electrophoresis, fewer electrophoresis runs need to be performed. For 
example, in an automated DNA sequencer having 40 lanes, such as the Pharmacia A.L.F.™
(Pharmacia, Uppsala, Sweden), up to 40 patient samples can be run on a gel rather than 10
patient samples using 4 lanes each. In systems such as the Applied Biosystems Inc. 377™
(Foster City, CA), which permit the use of 4 fluorescent dyes per lane, 4 patient samples
may be run per lane instead of one patient sample per lane. Use of networked high-speed
DNA sequencers with software that can combine data taken from different instruments,
such as the MICROGENE BLASTER™ sequencer and GENE OBJECTS™ software, 
(both part of the OPEN GENE™ System available from Visible Genetics Inc., Toronto, 
Canada) can also enhance this method. 

This same methodology can be applied to virtually any known polymorphic genetic
locus to obtain efficient characterization of the locus. For example, identification of alleles
in the highly polymorphic Human Leukocyte Antigen (HLA) gene system (Parham, P. et al.,
"Nature of Polymorphism in HLA-A, -B and -C Molecules", Proc. Natl. Acad. Sci. USA
85: 4005-4009 (1988)) will benefit greatly from the method. Moreover, the method is not
limited to human polymorphisms. It may be used for other animals, plants, bacteria, viruses
or fungi. It may be used to distinguish the allelic variants present among a mixed sample of
organisms. In human or animal diagnostics, the method can be used to identify which
subspecies of bacteria or viruses are present in a body sample. This diagnosis could be
essential for determining whether drug-resistant strains of pathogens are present in an 
individual. 

After developing an assay methodology in the manner outlined above for a particular 
known polymorphic gene, the first step of the method of the invention is obtaining a 
suitable sample of material for testing using this methodology. The genetic material tested 
using the invention may be chromosomal DNA, messenger RNA, cDNA, or any other form
of nucleic acid polymer which is subject to testing to evaluate polymorphism, and may be 
derived from various sources including whole blood, tissue samples including tumor cells, 
sperm, and hair follicles. 

In some cases, it may be advantageous to amplify the sample, for example using 
polymerase chain reaction (PCR) amplification, to create one which is enriched in the 
particular genetic sequences of interest. Amplification primers for this purpose are 
advantageously designed to be highly selective for the genetic locus in question. For 
example, for HLA Class I testing, group specific and locus specific amplification primers 
have been disclosed in US Patent No. 5,424,184 and Cereb et al., "Locus-specific 
amplification of HLA class I genes from genomic DNA: locus-specific sequences in the
first and third introns of HLA-A, -B and -C alleles." Tissue Antigens 45:1-11 (1995) which
are incorporated herein by reference. 

Once a suitable sample is obtained, the sample is combined with the first sequencing 
reaction mixture. This reaction mixture contains a template-dependent nucleic acid 
polymerase, A, T, G and C nucleotide feedstocks, one type of chain terminating nucleotide
and a sequencing primer.

The selection of the template-dependent nucleic acid polymerase is not critical to the
success of the invention. A preferred polymerase, however, is Thermo Sequenase™, a
thermostable polymerase enzyme marketed by Amersham Life Sciences. Other suitable
enzymes include regular Sequenase™ and other enzymes used in sequencing reactions.

Selection of appropriate sequencing primers is generally done by finding a part of the 
gene, either in an intron or an exon, that lies near (within about 300 nt) the polymorphic 
region of the gene which is to be evaluated, is 5' to the polymorphic region (either on the 
sense or the antisense strand), and that is highly conserved among all known alleles of the 
gene. A sequencing primer that will hybridize to such a region with high specificity can then
be used to sequence through the polymorphic region. Other aspects of primer quality, such
as lack of palindromic sequence, and preferred G/C content are identified in the US Patent
No. 5,545,527.

In some cases it is impossible to select one primer that can satisfy all the above
demands. Two or more primers may be necessary to test among some sub-groups of a
genetic locus. In these cases it is necessary to attempt a sequencing reaction using one of
the primers. If hybridization is successful, and a sequencing reaction proceeds, then the
results can be used to determine allele identity. If no sequencing reactions occur, it may be
necessary to use another one of the primers.

The sequencing reaction mixture is processed through multiple cycles during which 
primer is extended and then separated from the template DNA from the sample and new
primer is reannealed with the template. At the end of these cycles, the product
oligonucleotide fragments are separated by gel electrophoresis and detected. This process
is well known in the art. Preferably, this separation is performed in an apparatus of the type
described in US Patent Application No. 08/353,932, the continuation in part thereof filed
on December 12, 1995 as International Patent Application No. PCT/US95/15951 using thin
microgels as described in International Patent Application No. PCT/US95/14531, all of
which applications are incorporated herein by reference.

The practice of the instant invention is assisted by technically advanced methods for 
precisely identifying the location of nucleotides in a genetic locus using single nucleotide 
sequencing. The issue is that in the technique of single nucleotide sequencing using 
dideoxy-sequencing/ electrophoresis analysis it is sometimes a challenge to determine how 
many nucleotides fall between two of the identified nucleotides:

A _ AA or A AA

In many cases, there is little difficulty, particularly when short sequencing reaction products 
are examined (200 nt or less), because the electrophoretic separation of reaction products 
follows a highly predictable pattern. A computer or a human can easily determine the 
number of nucleotides lying between two identified nucleotides by simply measuring the 
gap and determining the number of singleton peaks that would otherwise fall in the gap. 
The problem becomes relevant in longer electrophoresis runs where resolution and 
separation of sequencing reaction fragments is lost. In addition, loss of consistency in 
maintaining the temperature, electric field strength or other operating parameters can lead 
to inconsistencies in the spacing between peaks and ambiguities in interpretation. Such 
ambiguities can prevent accurate identification of alleles. 
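
For short runs in which peak spacing is regular, the gap measurement described above might be sketched in Python as follows. The sketch assumes a single constant spacing between adjacent singleton peaks (`singleton_spacing`), which is exactly the assumption that breaks down in longer runs; the function name and the example peak times are hypothetical.

```python
def intervening_bases(peak_times, singleton_spacing):
    """Estimate the number of bases lying between consecutive peaks.

    peak_times: migration times of the peaks for the one base detected,
                in increasing order.
    singleton_spacing: assumed constant separation between fragments
                       that differ in length by one nucleotide.
    """
    gaps = []
    for earlier, later in zip(peak_times, peak_times[1:]):
        steps = round((later - earlier) / singleton_spacing)
        gaps.append(max(steps - 1, 0))  # adjacent peaks have zero bases between
    return gaps


# Hypothetical A-lane peak times with slightly noisy spacing.
print(intervening_bases([10.0, 12.1, 13.0, 17.9], singleton_spacing=1.0))
```

Rounding to the nearest whole step tolerates small timing noise, but once the noise approaches half the singleton spacing the estimate becomes unreliable, which is the ambiguity the text describes.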

One simple way to resolve these problems is to run a "control" lane with all samples 
which identifies all possible nucleotide fragment lengths from the genetic locus being 
sequenced, for example by performing a reaction which includes all 4 dideoxy nucleotides. 
The control lane indicates precisely the number of nucleotides that lie in the gaps between 
the identified nucleotides, as in Fig. 3.
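
As a hedged sketch of how such a control lane could be used computationally, each peak in the single-base lane can be matched to the nearest control-lane peak, and the number of control peaks between two matched peaks then gives the number of intervening bases without any assumption of even spacing. The function names and peak positions below are hypothetical, and matching by nearest position is an assumption of the sketch rather than a feature stated in the text.

```python
from bisect import bisect_left


def nearest_index(sorted_values, x):
    """Index of the entry in `sorted_values` closest to `x`."""
    i = bisect_left(sorted_values, x)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sorted_values)]
    return min(candidates, key=lambda j: abs(sorted_values[j] - x))


def gaps_from_control(sample_peaks, control_peaks):
    """Count control-lane peaks lying between consecutive sample peaks.

    control_peaks: positions of every fragment length (all four terminators).
    sample_peaks:  positions of peaks in the single-base lane, increasing.
    """
    control = sorted(control_peaks)
    idx = [nearest_index(control, p) for p in sample_peaks]
    return [b - a - 1 for a, b in zip(idx, idx[1:])]


# Unevenly spaced control ladder, as might occur late in a long run.
control = [10.0, 11.0, 12.1, 13.3, 14.6, 16.0]
sample = [10.05, 12.0, 16.1]
print(gaps_from_control(sample, control))
```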

Any sequencing format can use such a control lane, be it "manual" sequencing, using 
radioactively labeled oligonucleotides and autoradiograph analysis (see Chp 7, Current 
Protocols in Molecular Biology, Eds. Ausubel, F.M. et al, (John Wiley & Sons; 1995)), or 
automated laser fluorescence systems. 

An improved method for identifying alleles, which does not rely on measuring the 
number of nucleotides lying between two identified nucleotides is disclosed in US Patent 
Application Serial No. 08/497,202. Briefly, this method relies on the actual shape of the 
data signal ("wave form") received from an automated laser fluorescence DNA analysis 
system. The method compares the patient sample wave form to a database of wave forms 
representing the known alleles of the gene. The known wave form that best matches the 
sample wave form identifies the allele in the sample. 
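
The details of the wave form comparison are those of the referenced application and are not reproduced here. Purely as a stand-in, the idea of selecting the best-matching known wave form can be sketched in Python with a plain Pearson correlation over equal-length, already aligned traces; the correlation measure, the function names and the toy traces are all assumptions of this sketch.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation of two equal-length signal traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def best_match(sample, database):
    """Name of the known wave form most similar to the sample trace."""
    return max(database, key=lambda name: correlation(sample, database[name]))


# Hypothetical idealized traces for two known alleles and one noisy sample.
database = {
    "allele_1": [0, 1, 0, 0, 1, 0],
    "allele_2": [0, 1, 0, 1, 0, 0],
}
sample = [0.1, 0.9, 0.0, 0.1, 0.8, 0.1]
print(best_match(sample, database))
```

In practice the traces would first need to be aligned and normalized, steps which this sketch deliberately omits.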

A further embodiment of the invention which may be applied in some cases, including 
HLA typing, to further expedite and reduce the expense of testing, involves the 
simultaneous use of two chain terminating nucleotides in a single reaction mixture. For
example, a single reaction containing a mixture of ddATP and ddCTP could be performed
initially. The peaks observed on the sequencing gel are either A or C, and cannot be
distinguished (unless dye-labeled terminators with different labels are used). In some cases,
however, this information is sufficient to identify the nature of the allele. For example, in
the simple three allele case shown in Fig. 1, the sequence information would identify the T
allele unambiguously. For more complicated polymorphic genes, a second sequencing run
may be performed, including two chain terminating nucleotides, one being the same as one
included in the first reaction and the other being different from those included in the first
reaction mixture. These two sequencing procedures permit determination of the position of
three bases expressly and the fourth base by difference in a total of only two reactions.

As discussed below, some wave forms may represent heterozygote mixtures. The
database should include wave forms from all known heterozygote combinations to ensure
that the matching process includes the full variety of possibilities. When a patient sample is
found to be a possible heterozygote, the software can be designed to inform the user of the
next analytical test that should be performed to help distinguish among possible allelic
members of the heterozygote.

Heterozygous polymorphic genetic loci need special consideration. Where more than
one variant of the same locus exists in the patient sample, complex results are obtained when
single lane sequencing begins at a commonly shared sequencing primer site. This problem
is also found in traditional 4 lane sequencing (see Santamaria P, et al., "HLA Class I
Sequence-Based Typing", Human Immunology 37: 39-50 (1993)). However, Figure 2
illustrates an improved method for distinguishing heterozygotic alleles using the present 
invention. 

The problem presented by a heterozygous allele is illustrated in Fig. 2a. The observed
data from single nucleotide sequencing of the A lane cannot point to the presence of a
unique allele. Either the locus is heterozygous or a new allele has been found. (For well
studied genetic loci, new alleles will be rare, so heterozygosity may be assumed.) The
problem flows from a mixture of alleles in the patient sample which is analyzed. For
example, the observed data may result from the additive combination of allele 1 and allele 2.
Where there are more than two possible alleles, it is necessary to compare each of the
known allelic variants to the observed data to see if they could result in the observed data.
Each heterozygote pair will have its own distinct pattern. Fig. 3b illustrates that alleles 3
and 4 cannot underlie the observed data because certain A nucleotides in those alleles are
not represented in the data. They are thus eliminated from consideration. The remaining
alleles 5, 6, and 7 could be used in combination with others to generate the observed data.
In the case of human genomic DNA, only two alleles at any one locus can generally be
present (one from each chromosome). It is necessary, therefore, to combine all known
alleles to determine if they can be additively combined to result in the observed data. (In
fact, the data appearance of known and hypothetical heterozygote pairs can be prepared
and stored in an additional database to facilitate analysis.) In Fig. 3b combination of alleles
5 and 6 will result in the observed data, and combination of neither 5 & 7 nor 6 & 7 gives
the desired result. Therefore, if only the alleles 3 to 7 were known, the only two that could 
possibly be combined to result in the observed data would be 5 and 6. Allelic identification 
could be made on this basis. 
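
The elimination and pairing steps just described can be sketched in Python as follows. The A positions assigned to alleles 3 through 7 are hypothetical values chosen to reproduce the outcome described above and are not taken from the figures; the function name `candidate_pairs` is likewise introduced only for this sketch.

```python
from itertools import combinations_with_replacement


def candidate_pairs(observed, alleles):
    """Allele pairs whose combined A positions reproduce the observed data.

    observed: positions at which an A peak was seen in the sample.
    alleles:  mapping of allele name to the A positions of that allele.
    """
    observed = set(observed)
    # An allele with an A at a position absent from the data cannot be present.
    possible = {name: set(pos) for name, pos in alleles.items()
                if set(pos) <= observed}
    # Pairs include (x, x), so that homozygotes are considered as well.
    return [(a, b)
            for a, b in combinations_with_replacement(sorted(possible), 2)
            if possible[a] | possible[b] == observed]


# Hypothetical A positions for the known alleles.
KNOWN = {
    "allele_3": [1, 4, 9],
    "allele_4": [2, 6],
    "allele_5": [1, 3, 6],
    "allele_6": [1, 4, 6, 7],
    "allele_7": [1, 4, 6],
}
print(candidate_pairs([1, 3, 4, 6, 7], KNOWN))
```

If the returned list holds more than one pair, the result is ambiguous and a further reaction for another base would be needed to choose among them.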

In some cases, where more than one pair of alleles can be combined to obtain the 
observed data, as in Fig. 3c, it is necessary to determine the relative locations of other
nucleotides in order to distinguish which allelic pair is present. Identification of another
specific type of nucleotide serves to distinguish which pair of alleles is present. Fig. 3d
shows further that sometimes observed data may appear to be a homozygote for one allele,
but in fact it may consist of a heterozygote pair, either including the suggested allele, or
not. The alleles that might lead to such confusion, by masking possible heterozygotes, can
be identified in the known allele database. Identification of these alleles cannot be
confirmed unless further tests are made which can confirm whether a heterozygote 
underlies the observed data. 

All of the analyses of comparing the known alleles to the observed data can be
conveniently assisted by the use of high speed computer analysis.

In rare cases, such as in Fig. 4, sequencing of all 4 nucleotides will not permit
identification of which allelic pair is present. The ambiguity may be reported as such, especially if
the clinical need for distinguishing is low. Alternatively, high stringency hybridization
probes may be used, as they can identify the presence of specific allelic variants. Protocols
for hybridization probes are well known in the art (see Chp 6.4, Current Protocols in
Molecular Biology, Eds. Ausubel, F.M. et al., (John Wiley & Sons; 1995)).

Occasionally, quantitative measurements of the amount of sequencing reaction
products may be sufficient to distinguish whether only one allele has an A at a specific locus,
or both. It is found experimentally, however, that quantitative analysis of sequencing peak
heights can only rarely assist in the analysis.

Quantitative analysis proves more useful for resolving the problem of "allelic dropout".
In cases of allelic dropout, sequencing reactions identify an apparent homozygote, but only
because the sequencing primer has failed to initiate sequencing reactions on one of the two
alleles. This may have resulted from heterogeneity at the sequencing primer site itself,
which prevents the primer from hybridizing to the target site or initiating chain extension.
(This problem should be rare, as sequencing primers according to the invention are designed
to hybridize generally to highly conserved areas of the genome.)

Allelic dropout is resolved by amplifying both alleles from genomic DNA using
quantitative polymerase chain reaction (see, for example, Chp 15, Current Protocols in
Molecular Biology, Eds. Ausubel, F.M. et al., (John Wiley & Sons; 1995)). The sequencing
primer is used as one of a pair of PCR primers. A fragment of DNA spanning the alleles in
question is amplified quantitatively. At the end of the reaction, quantities of PCR products 
will be only half the expected amount if only one allele is being amplified. Quantitative 
analysis can be made on the basis of peak heights of amplified bands observed by 
automated DNA sequencing instruments. 
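
The half-quantity test for allelic dropout can be sketched as below. This is an illustrative check only: the function name, the tolerance of 0.15 and the use of a single expected peak height from a two-allele control are assumptions of the sketch, not values given in this description.

```python
def dropout_suspected(observed_peak, expected_peak, tolerance=0.15):
    """Flag a sample whose PCR product is roughly half the expected amount.

    expected_peak is the peak height seen when both alleles amplify
    (hypothetical control value); tolerance is an assumed allowance.
    """
    if expected_peak <= 0:
        raise ValueError("expected_peak must be positive")
    ratio = observed_peak / expected_peak
    return abs(ratio - 0.5) <= tolerance

print(dropout_suspected(520.0, 1000.0))  # product near half of the control
print(dropout_suspected(980.0, 1000.0))  # product near the full control amount
```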

A plurality of pathogens can produce even more complex results from single
nucleotide sequencing. The complexity flows from an unlimited number of variants of the
pathogen that may be present in the patient sample. For example, viruses and bacteria may
have variable surface antigen coding domains which allow them to evade host immune
system detection. To avoid this problem of variability, the genetic locus selected for
examination is preferably highly conserved among all variants of the pathogen, such as
ribosomal DNA or functionally critical protein coding regions of DNA. Where variable
regions of the pathogen must be analyzed, an extended series of comparisons between the
observed data and the known alleles can assist the diagnosis by determining which alleles
are not substantial components of the observed data.

The method of the present invention lends itself to the construction of tailored kits
which provide components for the sequencing reactions. As described in the examples,
nucleoside and dideoxynucleoside preparations, and buffers for reactions. Unlike
conventional kits, however, the amount of each type of dideoxynucleoside required for any
given assay is not the same. Thus, for an assay in which the A sequencing reaction is
performed first and on all samples, the amount of dideoxy-A included in the kit may be 5 to
10 times greater than the amount of the other dideoxynucleosides.

The following examples are included to illustrate aspects of the instant invention and
are not intended to limit the invention in any way.

Example 1

Identification of HLA Class II gene alleles present in an individual patient sample can
be performed using the method of the instant invention. For example, DRB1 is a
polymorphic HLA Class II gene with at least 107 known alleles (See Bodmer et al.
Nomenclature for Factors of the HLA System, 1994. Hum. Imm. 41, 1-20 (1994)).

The broad serological subtype of the patient sample DRB1 allele is first determined by 
attempting to amplify the allele using group specific primers. 

Genomic DNA is prepared from the patient sample using a standard technique such as
proteinase K proteolysis. Allele amplification is carried out in Class II PCR buffer:
10 mM Tris pH 8.4
50 mM KCl
1.5 mM MgCl2
0.1% gelatin

200 uM each of dATP, dCTP, dGTP and dTTP
12 pmol of each group specific primer
40 ng patient sample genomic DNA

Groups are amplified separately. The group specific primers employed are:

                                                            PRODUCT SIZE

DR1
5'-PRIMER:   TTGTGGCAGCTTAAGTTTGAAT     [Seq ID No. 1]      195 & 196
3'-PRIMERS:  CCGCCTCTGCTCCAGGAG         [Seq ID No. 2]
             CCCGCTCGTCTTCCAGGAT        [Seq ID No. 3]

DR2 (15 AND 16)
5'-PRIMER:   TCCTGTGGCACCCTAAGAG        [Seq ID No. 4]      197 & 213
3'-PRIMERS:  CCGCGCCTGCTCCAGGAT         [Seq ID No. 5]
             AGGTGTCCACCGCGCGGCG        [Seq ID No. 6]

DR3, 8, 11, 12, 13, 14
5'-PRIMER:   CACGTTTCTTGGAGTACTCTAC     [Seq ID No. 7]      270
3'-PRIMER:   CCGCTGCACTGTGAAGCTCT       [Seq ID No. 8]

DR4
5'-PRIMER:   GTTTCTTGGAGCAGGTTAAACA     [Seq ID No. 9]      260
3'-PRIMERS:  CTGCACTGTGAAGCTCTCAC       [Seq ID No. 10]
             CTGCACTGTGAAGCTCTCCA       [Seq ID No. 11]

DR7
5'-PRIMER:   CCTGTGGCAGGGTAAGTATA       [Seq ID No. 12]     232
3'-PRIMER:   CCCGTAGTTGTGTCTGCACAC      [Seq ID No. 13]

DR9
5'-PRIMER:   GTTTCTTGAAGCAGGATAACTTT    [Seq ID No. 14]     236
3'-PRIMER:   CCCGTAGTTGTGTCTGCACAC      [Seq ID No. 15]

DR10
5'-PRIMER:   CGGTTGCTGGAAAGACGCG        [Seq ID No. 16]     204
3'-PRIMER:   CTGCACTGTGAAGCTCTCAC       [Seq ID No. 17]

The 5'-primers of the above groups are terminally labelled with a fluorophore
such as a fluorescein dye at the 5'-end.

The reaction mixture is mixed well. 2.5 units Taq Polymerase are added and
mixed immediately prior to thermocycling. The reaction tubes are placed in a Robocycler
Gradient 96 (Stratagene, Inc.) and subjected to thermal cycling as follows:

1 cycle      94 C    2 min
10 cycles    94 C    15 sec
             67 C    1 min
20 cycles    94 C    10 sec
             61 C    50 sec
             72 C    39 sec
1 cycle      72 C    2 min
             4 C     cool on ice until ready for electrophoretic analysis.
Seven reactions (one for each group specific primer set) are performed. After
amplification, 2 ul of each of the PCR products are pooled and mixed with 11 ul of
loading buffer consisting of 100% formamide with 5 mg/ml dextran blue. The products are
run on a 6% polyacrylamide electrophoresis gel in an automated fluorescence detection
apparatus such as the Pharmacia A.L.F.™ (Uppsala, Sweden). Size determinations are
performed based on migration distances of known size fragments. The serological group is
identified by the length of the successfully amplified fragment. Only one fragment will
appear if both alleles belong to the same serological group; otherwise, for heterozygotes
containing alleles from two different groups, two fragments appear.
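
The assignment of a serological group from fragment length can be sketched as a simple lookup. The product sizes below are those listed with the group specific primers above, as read from the primer table; the function name and the tolerance parameter are assumptions of the sketch. Note that the DR1 and DR2 products (195, 196 and 197 bp) lie very close together, so any sizing tolerance will make that call ambiguous.

```python
# Product sizes (bp) per group, as read from the group specific primer table.
PRODUCT_SIZES = {
    "DR1": (195, 196),
    "DR2": (197, 213),
    "DR3/8/11/12/13/14": (270,),
    "DR4": (260,),
    "DR7": (232,),
    "DR9": (236,),
    "DR10": (204,),
}

def groups_for_fragment(size_bp, tolerance=0):
    """Return every group with a product within `tolerance` bp of the measured size."""
    return [
        group
        for group, sizes in PRODUCT_SIZES.items()
        if any(abs(size_bp - s) <= tolerance for s in sizes)
    ]

print(groups_for_fragment(260))               # a single group
print(groups_for_fragment(196, tolerance=1))  # two groups, needs resolving
```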

Once the serological group is determined, specificity within the group is 
determined by single nucleotide sequencing according to the invention. 

Each positive group from above is individually amplified for sequence analysis.
The PCR amplification primers are a biotinylated 3'-PRIMER amp B:

(5' Biotin-CCGCTGCACTGTGAAGCTCT 3') [Seq ID No. 8]

and the appropriate 5'-PRIMER described above. The conditions for amplification are
identical to the method described above.

After amplification, sequencing is performed using the following sequencing
primer:

5' - GAGTGTCATTTCTTCAA [Seq ID No. 18] 

The PCR product (10 ul) is mixed with 10 ul of washed Dynabeads M-280 (as
per manufacturer's recommendations, Dynal, Oslo, Norway) and incubated for 1 hr at room
temperature. The beads are washed with 50 ul of 1X BW buffer (10 mM Tris, pH 7.5, 1
mM EDTA, 2 M NaCl) followed by 50 ul of 1X TE buffer (10 mM Tris, 1 mM EDTA).
After washing, resuspend the beads in 10 ul of TE and take 3 ul for the sequencing reaction
which consists of:
3 ul bound beads
3 ul sequencing primer (30 ng total)
2 ul 10X sequencing buffer (260 mM Tris-HCl, pH 9.5, 65 mM MgCl2)
2 ul of Thermo Sequenase™ (Amersham Life Sciences, Cleveland) (diluted 1:10 from
stock)
3 ul H2O

Final Volume: 13 ul. Keep this sequencing reaction mix on ice.

Remove 3 ul of the sequencing reaction mix and add to 3 ul of one of the
following mixtures, depending on the termination reaction desired.
A termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddATP

C termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddCTP

G termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddGTP

T termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddTTP

Total termination reaction volume: 6 ul



Cycle the termination reaction mixture in a Robocycler for 25 cycles (or fewer if found to
be satisfactory):
95 C    30 sec
50 C    10 sec
70 C    30 sec

After cycling, add 12 ul of loading buffer consisting of 100% formamide with 5
mg/ml dextran blue, and load an appropriate volume to an automated DNA sequencing
apparatus, such as a Pharmacia A.L.F.

Allele identification requires analysis of results from the automated DNA sequencing
apparatus as in Fig. 5. Fragment length analysis revealed that one allele of the patient
sample was from the DR4 serological subtype (data not shown). Single nucleotide
sequencing was then performed to distinguish among the possible DR4 alleles. Lane 1
illustrates the results of single nucleotide sequencing for the "C" nucleotide of a patient
sample (i.e. using the C termination reaction, above). Lanes 2 and 3 represent C nucleotide
sequence results for 2 of the 22 known DR4 alleles. Similar results for the 20 other alleles
are stored in a database. The patient sample is then compared to the known alleles using
one or more of the methods disclosed in US Patent Application Serial No. US 08/497,202.

In Fig. 5, Lane 1 first requires alignment with the database results. The alignment
requires determination of one or more normalization coefficients (for stretching or
shrinking the results of Lane 1) to provide a high degree of overlap (i.e. maximize the
intersection) with the previously aligned database results. The alignment coefficient(s)
may be calculated using the Genetic Algorithm method of the above noted application, or
another method. The normalization coefficients are then applied to Lane 1. The aligned
result of Lane 1 is then systematically correlated to each of the 22 known alleles.
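
The search for a normalization coefficient can be sketched as follows. This sketch does not reproduce the Genetic Algorithm method of the referenced application; it swaps in a plain grid search over candidate stretch factors, and it assumes that "overlap" is measured as the sum of pointwise minima of the two traces. The function names, the traces and the candidate factors are all hypothetical.

```python
def stretch(trace, factor):
    """Resample so that output index i reads the input at i / factor (linear interpolation)."""
    n = len(trace)
    out = []
    for i in range(n):
        x = i / factor
        if x > n - 1:
            out.append(0.0)  # beyond the end of the recorded data
            continue
        lo = int(x)
        hi = min(lo + 1, n - 1)
        frac = x - lo
        out.append(trace[lo] * (1 - frac) + trace[hi] * frac)
    return out

def overlap(a, b):
    """Intersection of two traces, taken here as the sum of pointwise minima."""
    return sum(min(x, y) for x, y in zip(a, b))

def best_stretch(sample, reference, factors):
    """Pick the candidate factor that maximizes overlap with the reference."""
    return max(factors, key=lambda f: overlap(stretch(sample, f), reference))

# Toy traces: the sample ran at half the spacing of the reference.
reference = [0.0] * 30
reference[10] = reference[20] = 1.0
sample = [0.0] * 30
sample[5] = sample[10] = 1.0
print(best_stretch(sample, reference, [0.5, 1.0, 1.5, 2.0]))
```

A real alignment would also need an offset term and a finer search; the grid here only illustrates the idea of maximizing the intersection.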

The correlation takes place on a peak by peak basis as illustrated in Fig. 6. Each peak
in the aligned patient data stream, representing a discrete sequencing reaction termination
product, is identified. (Minor peaks representing sequencing artifacts are ignored.) The
area under each peak is calculated within a limited radius of the peak maximum (i.e. 20 data
points for A.L.F. Sequencer results). A similar calculation is made for the area under the
curve of the known allele at the same point. The swath of overlapping areas is then
compared. Any correlation below a threshold of reasonable variation, for example 80%,
indicates that a peak is present in the patient data stream and not in the other. If one peak
is missing, then the known allele is rejected as a possible identifier of the sample.

The reverse comparison is also made: peaks in the known data stream are identified
and compared, one by one, to the patient sample results. Again, the presence of a peak in
one data stream that is not present in the other eliminates the known data stream as an
identifier of the sample.
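
The two-way, peak by peak comparison can be sketched as below. The description does not spell out how the correlation of overlapping areas is computed, so the sketch assumes it is the ratio of the smaller to the larger area within the radius. The default radius of 20 data points and the 80% threshold come from the text; the function names and the toy traces are hypothetical.

```python
def area(trace, center, radius):
    """Sum of the trace within `radius` data points of `center`."""
    lo = max(0, center - radius)
    hi = min(len(trace), center + radius + 1)
    return sum(trace[lo:hi])

def peaks_match(trace_a, trace_b, peaks_a, radius=20, threshold=0.8):
    """True if every listed peak of trace_a has a comparable area in trace_b."""
    for p in peaks_a:
        a = area(trace_a, p, radius)
        b = area(trace_b, p, radius)
        # Assumed correlation measure: smaller area divided by larger area.
        if a == 0 or min(a, b) / max(a, b) < threshold:
            return False
    return True

def allele_is_candidate(patient, allele, patient_peaks, allele_peaks, **kw):
    """Forward and reverse comparison; either failure rejects the known allele."""
    return (peaks_match(patient, allele, patient_peaks, **kw)
            and peaks_match(allele, patient, allele_peaks, **kw))

patient      = [0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0]
allele_same  = [0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0]
allele_extra = [0, 0, 5, 0, 0, 0, 4, 0, 0, 0, 5, 0, 0]  # extra peak at index 6
print(allele_is_candidate(patient, allele_same, [2, 10], [2, 10], radius=1))
print(allele_is_candidate(patient, allele_extra, [2, 10], [2, 6, 10], radius=1))
```

The second call fails only in the reverse direction, which is why both directions are checked.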

In Fig. 5, lane 2, for allele DRB1*0405, has a peak (marked X) not found in the patient
sample. Peak comparison between aligned lane 1 and lane 2 will fall below threshold at the
peak marked X. Lane 3 is for part of known allele DRB1*0401. In this case, each peak is
found to have a correlate in the other data stream. DRB1*0401 may therefore identify the
patient sample. (The results illustrated are much shorter than the 200-300 nt usually used
for comparison, so identity of the patient sample is not confirmed until the full diagnostic
sequence is compared.)

Example 2 

Results are obtained from the patient sample according to Example 1, above. The
sample results are converted into a "text" file as follows. The maximum of each peak is
located and plotted against the separation from the nearest other peak (minor peaks
representing noise are ignored) (Fig. 7). The peaks that are closest together are assumed to
represent single nucleotide separation, and a narrow range for single nucleotide separation
is determined. A series of timing tracks is proposed, each of which attempts to locate all
the peaks in terms of multiples of a possible single nucleotide separation. The timing track
that correlates best (by least mean squares analysis) with the maxima of the sample data is
selected as the correct timing track. The peak maxima are then plotted on the timing track.
The spaces between the peaks are assumed to represent other nucleotides. A text file may
now be generated which identifies the location of all nucleotides of one type and the single
nucleotide steps in between.

The text file for the patient sample is compared against all known alleles. The
known allele that best matches the patient sample identifies the sample.
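
The timing track selection and text conversion can be sketched as follows. This is a simplified reading of the procedure: candidate spacings are assumed to be supplied from the narrow range around the closest peaks, the first peak is used as the origin of every timing track, and the text output uses a letter for the sequenced base and "-" for each intervening step. All names and numbers are hypothetical.

```python
def best_spacing(peaks, candidates):
    """Choose the candidate spacing with the least squared error to the peak maxima."""
    origin = peaks[0]

    def error(step):
        total = 0.0
        for p in peaks:
            offset = p - origin
            nearest = round(offset / step) * step  # closest tick on the timing track
            total += (offset - nearest) ** 2
        return total

    return min(candidates, key=error)

def to_text(peaks, step, base="C", gap="-"):
    """Place each peak on the timing track and write the gaps as unknown steps."""
    origin = peaks[0]
    slots = [round((p - origin) / step) for p in peaks]
    line = [gap] * (slots[-1] + 1)
    for s in slots:
        line[s] = base
    return "".join(line)

peak_maxima = [100, 110, 131, 160]            # data point index of each maximum
candidates = [9.0, 9.5, 10.0, 10.5, 11.0]     # narrow range near the smallest gap
step = best_spacing(peak_maxima, candidates)
print(step, to_text(peak_maxima, step))
```

Restricting the candidates to a narrow range matters: without it, any sufficiently small spacing would fit the maxima trivially.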

Example 3

For HLA Class II DRB1 serological group DR4, 22 alleles are known. A
hierarchy of single nucleotide sequencing reactions can be used to minimize the number of
reactions required for identification of which allele is present. Reactions are performed
according to the methods of Example 1, above.

If it is established from the group specific reaction that only one DRB1 allele is
a DR4 subtype, then identification of that allele is made by the following steps:

1. Determine A nucleotide sequence. This identifies 16 of 22 known alleles;
then

2. Determine G nucleotide sequence. Identifies 10 of 22 known alleles; then

3. Combine A and G sequencing results by computer analysis. Identifies all 22
known alleles.

If the patient sample is identified at any one step, then the following step(s)
need not be performed for that sample.
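
The counting behind such a hierarchy can be sketched as below: for a chosen set of reactions, an allele is identified when no other known allele shares its combined result. The four alleles and their signature strings are invented for illustration and do not represent real DR4 data.

```python
from collections import Counter

def uniquely_identified(alleles, reactions):
    """Names of alleles whose combined signature over `reactions` is shared by no other allele."""
    sigs = {
        name: tuple(profile[r] for r in reactions)
        for name, profile in alleles.items()
    }
    counts = Counter(sigs.values())
    return sorted(name for name, s in sigs.items() if counts[s] == 1)

# Hypothetical per-base signatures (1 = peak at that position).
toy_alleles = {
    "a1": {"A": "1010", "G": "0100"},
    "a2": {"A": "1010", "G": "0001"},
    "a3": {"A": "0110", "G": "0001"},
    "a4": {"A": "1001", "G": "0010"},
}
print(uniquely_identified(toy_alleles, ["A"]))
print(uniquely_identified(toy_alleles, ["G"]))
print(uniquely_identified(toy_alleles, ["A", "G"]))
```

Running the same count for each candidate reaction shows which reaction resolves the most alleles and so should be performed first.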

Example 4 

If the group specific reaction in Example 1 indicates that two DR4 alleles are
present in the patient sample, then from the 22 known alleles, there are 253 possible allelic
pair combinations (22 homozygotes + 231 heterozygotes). Again, a hierarchy of single
nucleotide sequencing reactions can be used to minimize the number of reactions required
for identification of which allelic pair is present. Reactions are performed according to the
methods of Example 1, above.

1. Sequence G: Distinguishes among 10 homozygote pairs and 64
heterozygote pairs.

2. Sequence A: Distinguishes among 16 homozygote pairs and 23
heterozygote pairs.

3. Combine A and G sequencing results by computer analysis. Identifies all
known homozygotes and 169 known heterozygote alleles.

4. Sequence C: Distinguishes among 5 homozygote pairs and 18
heterozygote pairs.

5. Combine A, C and G sequencing results by computer analysis. Identifies all
known homozygotes and 219 heterozygote pairs.

6. Sequence T: Distinguishes one homozygote pair and 5 heterozygote pairs.

7. Combine A, C, G and T sequencing results by computer analysis. Identifies
all known homozygotes and 225 heterozygote pairs.

8. If at the end of sequencing the A nucleotides, allelic pairs can still not be
distinguished, Sequence Specific Oligonucleotide Probes may be used to distinguish which
of the pairs are present, according to the invention.

If the patient sample is identified at any one step, then the following step(s)
need not be performed for that sample.
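
The figure of 253 allelic pairs quoted in this example follows from counting one homozygote per allele plus every unordered pair of distinct alleles. A short check, with a hypothetical helper name:

```python
from math import comb

def allelic_pairs(n_alleles):
    """Return (homozygotes, heterozygotes, total) for n known alleles."""
    homozygotes = n_alleles
    heterozygotes = comb(n_alleles, 2)  # unordered pairs of distinct alleles
    return homozygotes, heterozygotes, homozygotes + heterozygotes

print(allelic_pairs(22))
```

The same count applied to the 35 known HLA Class I C alleles gives the 630 combinations referred to in Example 8.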

This example assumes that all alleles will be equally represented among the
patient samples analyzed. If certain alleles predominate in the population, then it may be
advantageous to perform reactions definitive for those alleles first, in order to reduce the
total number of reactions performed.

Example 5

Virtually all the alleles of the HLA Class I C gene can be determined on the
basis of exon 2 and 3 genomic DNA sequence alone (Cereb, N. et al. "Locus-specific
amplification of HLA class I genes from genomic DNA: locus-specific sequences in the
first and third introns of HLA-A, -B and -C alleles." Tissue Antigens 45:1-11 (1995)). The
primers used amplify the polymorphic exons 2 and 3 of all C-alleles without any
co-amplification of pseudogenes or B or A alleles. These primers utilize C-specific sequences
in introns 1, 2 and 3 of the C-locus.

Identification of alleles in a patient sample is performed according to the
method of Example 1, with the following changes. Patient sample DNA is prepared
according to standard methods (Current Protocols in Molecular Biology, Eds. Ausubel,
F.M. et al., (John Wiley & Sons; 1995)).

The following primers are used to amplify the HLA Class I C gene exon 2:

Forward Primer; Intron 1
Primer Name: C2I1

5' - AGCGAGTGCCCGCCCGGCGA - 3' SEQ ID No.: 19

Reverse Primer; Intron 2
Primer Name: C2RI2

5' - Biotin - ACCTGGCCCGTCCGTGGGGGATGAG - 3' SEQ ID NO.: 20

Amplicon size 407 bp.

The amplification was carried out in PCR buffer composed of 15.6 mM
ammonium sulfate, 67 mM Tris-HCl (pH 8.8), 50 uM EDTA, 15 mM MgCl2, 0.01%
gelatin, 0.2 mM of each dNTP (dATP, dCTP, dGTP and dTTP) and 0.2 mM of each
amplification primer. Prior to amplification, 40 ng of patient sample DNA is added, followed
by 2.5 units of Taq Polymerase (Roche Molecular). The amplification cycle consisted of:

1 min        96 C
5 cycles     96 C    20 sec
             70 C    45 sec
             72 C    25 sec
20 cycles    96 C    20 sec
             65 C    50 sec
             72 C    30 sec
5 cycles     96 C    20 sec
             55 C    60 sec
             72 C    120 sec

In a separate reaction, exon 3 of HLA Class I C is amplified using the following 
primers: 

Forward primer; intron 2-exon 3 border 
Primer name: C3I2E3

5' Biotin - GACCGCGGGGCCGGGGCCAGGG - 3' SEQ ID NO.: 21 

Reverse primer; intron 3 
Primer name: C3RI3 

5' - GGAGATGGGGAAGGCTCCCCACT - 3' SEQ ID No.: 22 

Amplicon size 333 bp. 

The same reaction conditions as listed for exon 2 are used to amplify the DNA. 

Sequencing reactions are next performed according to the method of Example 1 using
one of the following 5' fluorescent-labeled sequencing primers:

Exon 2:
Forward sequencing
5' - CGGGACGTCGCAGAGGAA - 3' (Intron 3) SEQ ID No.: 25

Exon 2:
Reverse sequencing
5' - GGAGGGTCGGGCGGGTCT - 3' (Intron 2) SEQ ID NO.: 24

Exon 3:
Forward sequencing
5' - CCGGGGCGCAGGTCACGA - 3' (Intron 1) SEQ ID NO.: 23

The termination reaction selected depends on whether a forward or reverse primer is 
chosen. Appendix 1 lists which alleles can be distinguished if a forward primer is used (i.e. 
sequencing template is the anti-sense strand). If a reverse primer is used for sequencing, 
the termination reaction selected is the complementary one (A for T, C for G, and vice 
versa). 
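
The choice of termination reaction for a given primer direction can be sketched as a lookup on the sense strand base of interest. The function name and the "forward"/"reverse" labels are conventions of this sketch only.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def termination_for_primer(sense_base, primer_direction):
    """Return which ddNTP termination reaction reports the given sense strand base."""
    if primer_direction == "forward":
        return sense_base               # extension product reads as the sense strand
    if primer_direction == "reverse":
        return COMPLEMENT[sense_base]   # extension product reads as the complement
    raise ValueError("primer_direction must be 'forward' or 'reverse'")

print(termination_for_primer("A", "forward"))
print(termination_for_primer("A", "reverse"))
```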

Homozygotic alleles of HLA Class I C are effectively distinguished by the following
sequencing order:

1. Determine sense strand A nucleotide sequence. Identifies 24 of 35 known
homozygotes; then

2. Determine sense strand C nucleotide sequence. Identifies 16 of 35 known
homozygotes; then

3. Combine A and C sequencing results by computer analysis. Identifies 31 of 35
known homozygotes; then

4. Determine sense strand G nucleotide sequence. Identifies 14 of 35 known
homozygotes; then

5. Combine A, C and G sequencing results by computer analysis. Identifies 33 of 35
known homozygotes.

The remaining 2 alleles, Cw*12022.hla and Cw*12021.hla, cannot be distinguished by
nucleotide sequencing of only exons 2 and 3. Further reactions according to the invention
may be performed to distinguish among these alleles.

If the patient sample is identified at any one step, then the following step(s) need not
be performed for that sample.

Heterozygotes are analyzed on the same basis; the order of single nucleotide
sequencing reactions is determined by picking which reactions will distinguish among the
greatest number of samples (data not shown), and performing those reactions first.

This example assumes that all alleles will be equally represented among the patient 
samples analyzed. If certain alleles predominate in the population, then it may be 
advantageous to perform reactions definitive for those alleles first, in order to reduce the 
total number of reactions performed. 



Example 6

One lipoprotein lipase (LPL) variant (Asn291Ser) is associated with reduced
HDL cholesterol levels in premature atherosclerosis. This variant has a single missense
mutation of A to C at nucleotide 1127 of the sense strand in Exon 6. This variant can be
distinguished according to the instant invention as follows.

Exon 6 of the LPL gene from a patient sample is amplified with a 5' PCR
primer located in intron 5 near the 5' boundary of exon 6

(5' - GCCGAGATACAATCTTGGTG - 3') [Seq ID No. 26]

The 3' PCR primer is located in exon 6 a short distance from the Asn291Ser mutation and
labeled with biotin.

(5' - biotin - CAGGTACATTTTGCTGCTTC - 3') [Seq ID No. 27]

PCR amplification reactions were performed according to the methods detailed in Reymer,
P.W.A., et al. "A lipoprotein lipase mutation (Asn291Ser) is associated with reduced HDL
cholesterol levels in premature atherosclerosis." Nature Genetics 10: 28-34 (1995).

Sequencing analysis was then performed according to the Thermo Sequenase™
(Amersham) method of Example 1, using a fluorescent-labeled version of the 5' PCR primer
noted above.

Since the deleterious allele has a C at nucleotide 1127 of the sense strand, the C
termination sequencing reaction was performed. The results of the reaction were recorded
on an automated DNA sequencing apparatus and analyzed at the 1127 site. The patient
sample either carries the C at that site, or it does not. If a C is present, the patient is
identified as having the "unhealthy" allele. If no C is present, then the "healthy" form of the
allele is identified. Patient reports may be prepared on this basis.

Example 7 

Health care workers currently seek to distinguish among Chlamydia trachomatis
strains to determine the molecular epidemiologic association of a range of diseases with
infecting genotype (See Dean, D. et al. "Major Outer Membrane Protein Variants of
Chlamydia trachomatis Are Associated with Severe Upper Genital Tract Infections and
Histopathology in San Francisco." J. Infect. Dis. 172:1013-22 (1995)). According to the
instant invention, the presence and genotype of pure and mixed cultures of C. trachomatis
may be determined by examining the C. trachomatis omp1 gene (Outer Membrane Protein
1).

The omp1 gene has at least 4 variable sequence ("VS") domains that may be used to
distinguish among the 15 known genotypes (Yuan, Y. et al. "Nucleotide and Deduced
Amino Acid Sequences for the Four Variable Domains of the Major Outer Membrane
Proteins of the 15 Chlamydia trachomatis Serovars." Infect. Immun. 57: 1040-1049
(1989)). Logically, to determine presence of a genotype in detectable amounts in a possibly
mixed culture, the technique must search for a nucleotide which is unique among the
genotypes at a specific location. For example, genotype H has a unique A at site 284. No
other genotype shares this A, therefore it is diagnostic of genotype H. Other genotypes
have other unique nucleotides. On this basis, a preferred order of single nucleotide
sequencing may be determined, as follows.
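
Before turning to the protocol, the search for diagnostic nucleotides described above can be sketched as a column by column scan of aligned genotype sequences. The sequences below are invented four-base toys, not omp1 data, and the function name is hypothetical; the sketch assumes the sequences are already aligned so that equal indices refer to the same site.

```python
def diagnostic_sites(genotypes):
    """Map each genotype to the (position, base) pairs that no other genotype shares."""
    result = {name: [] for name in genotypes}
    length = min(len(seq) for seq in genotypes.values())
    for pos in range(length):
        column = {name: seq[pos] for name, seq in genotypes.items()}
        for name, base in column.items():
            # A base is diagnostic when exactly one genotype carries it at this site.
            if sum(1 for b in column.values() if b == base) == 1:
                result[name].append((pos, base))
    return result

toy_genotypes = {"D": "ACGT", "E": "ACGA", "H": "AAGT"}
print(diagnostic_sites(toy_genotypes))
```

Genotypes with no diagnostic site for a given base can only be inferred by elimination, which is why the order of termination reactions matters.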

Patient samples were obtained and DNA was extracted using standard SDS/Proteinase
K methods. The sample was alternatively prepared according to Dean, D. et al.
"Comparison of the major outer membrane protein sequence variant regions of B/Ba
isolates: a molecular epidemiologic approach to Chlamydia trachomatis infections." J.
Infect. Dis. 166: 383-392 (1992). In brief, the sample was washed once with 1X PBS,
centrifuged at 14,000 g, resuspended in dithiothreitol and Tris-EDTA buffer, and boiled
before PCR. One microliter of the sample was used in a 100 microliter reaction volume
that contained 50 mM KCl, 10 mM Tris-Cl (pH 8.1), 1.5 mM MgCl2, 100 micromolar
(each) dATP, dCTP, dGTP, and dTTP, 2.5 U of ampli-Taq DNA polymerase
(Perkin-Elmer Cetus, Foster City, CA), and 150 ng of each primer. The upstream primer was F11:

5' - ACCACTTGGTGTGACGCTATCAG - 3' [Seq ID No. 28]

(base pair [bp] position 154-176),

and the downstream primer was B11:

5' - CGGAATTGTGCATTTACGTGAG - 3' [Seq ID No. 29]

(bp position 1187-1166).



The thermocycler temperature profile was 95 degrees C for 45 sec, 55 degrees C for 1
min, and 72 degrees C for 2 min, with a final extension of 10 min at 72 degrees C after the
last cycle. One microliter of the PCR product was then used in each of two separate nested
100 microliter reactions with primer pair:
MF21

5' - CCGACCGCGTCTTGAAAACAGATGT - 3' [Seq ID No. 30], and
MB22

5' - CACCCACATTCCCAGAGAGCT - 3' [Seq ID No. 31]

which flank VS1 (Variable Sequence 1) and VS2, and primer pair
MVF3

5' - CGTGCAGCTTTGTGGGAATGT - 3' [Seq ID No. 32], and
MB4

5' - CTAGATTTCATCTTGTTCAATTGC - 3' [Seq ID No. 33]
which flank VS3 and VS4 (see Dean, D. and Stephens, R.S. "Identification of individual
genotypes of Chlamydia trachomatis in experimentally mixed infections and mixed
infections among trachoma patients." J. Clin. Microbiol. 32:1506-10 (1994)). These
primer sets uniformly amplify prototype C. trachomatis serovars A-K and L1-3, including
Ba, Da, Ia, and L2a. A sample of each product (10 microliters) was run on a 1.5% agarose
gel to confirm the size of the amplification product. All PCR products were purified
(GeneClean II; Bio 101, La Jolla, CA) according to the manufacturer's instructions.

All samples that were positive for presence of C. trachomatis by PCR were subjected
to omp1 genotyping by single nucleotide sequencing. Amplification for sequencing
reactions was performed as above using at least one of the above noted amplification
primer pairs, with a 5' biotinylated version of either one of the primers.
The biotinylated strand was separated with Dynal beads and selected termination
reactions were performed as in Example 1 using a 5' fluorescent labeled version of MF21 or
MVF3.

The selection of termination reactions depends on the degree of resolution among
genotypes desired. Only 1-3% of clinical C. trachomatis samples contain mixed genotypes.
Nonetheless, other pathogens are more commonly mixed, such as HIV, HPV and Hepatitis
C. For all these organisms, it is important to have a method of distinguishing heterogeneous
samples.

The first 25 nt of the T termination reaction for C. trachomatis VS1 can be used to
distinguish among 3 groups of genotypes, as illustrated in Fig. 8A. The observed results
for Sample 1 in Fig. 8A demonstrate that detectable levels of at least one of the Group 1 and
at least one of the Group 3 genotypes are present. Group 2 is not detected.

If a higher degree of resolution is required, then further reactions are necessary. To
distinguish among possible Group 1s, the VS1 A reaction is performed. Fig. 8B illustrates
possible A results. The observed results of Sample 1 show an A at site 257. This A could
be provided by only E, F or G genotypes. Since the T track has already established the
absence of both F and G, then E must be among the genotypes present. Further, the
absence of an A at 283 indicates that neither D nor F nor G are present. The presence of E
and the absence of D, F and G may be reported.

Other Group 1 genotypes may be present in addition to E; they do not appear because
their presence is effectively masked by E. Other single nucleotide termination reactions can
be performed to distinguish among these other possible contributors, if necessary. The
investigator simply determines which single nucleotide reaction will effectively distinguish
among the genotypes which may be present and need to be distinguished.

Alternatively, Sample 2, which showed the presence of Group 1 only in the T reaction,
is shown to be comprised of only Ba genotype because of an absence of A at 268. This
shows that both the presence and absence of nucleotides can be used to determine the
presence of some genotypes in some circumstances.

The first 25 nt of C and G termination reactions for VS1 only are included in Fig. 8C
to show how an investigator can determine which reaction to select and perform. If higher
degrees of resolution are required, the termination reactions for VS2, VS3 and VS4 may be
performed.

Not only the genotype, but also variants of D, E, F, H, I and K genotypes (as disclosed
in Dean, D. et al. "Major Outer Membrane Protein Variants of Chlamydia trachomatis Are
Associated with Severe Upper Genital Tract Infections and Histopathology in San
Francisco." J. Infect. Dis. 172:1013-22 (1995)) may be distinguished by using the above
single nucleotide sequencing method.

EXAMPLE 8

The allelic frequencies of HLA Class I C are distributed among Canadians as
follows:

Allele           Frequency (%)
Cw1              5.5
Cw2              4.4
Cw4              10.0
Cw5              6.4
Cw6              9.4
Cw7              28.9
Cw9              7.2
Cw10             5.7
Cw11             0.5
Unknown/other    22.0

On the basis of this data, for a Canadian sample, it is preferable to perform termination
reactions that preferentially distinguish homozygotes and heterozygotes containing a Cw7
allele (i.e. Cw*0701 to Cw*0704) first. This should be followed by Cw4, Cw6 and Cw9,
etc. Cw7 is preferentially distinguished on the basis of C/G analysis (122 out of 134
possible combinations, see Appendix 2), plus a further 320 out of the remaining 496.
Cw4 is also preferentially distinguished on the basis of C/G analysis (57 out of 69), with a
further 385 out of the remaining 561. Thus the preferred order of termination reactions is
as follows:

1 . Determine sense strand C nucleotide sequence for patient sample exon 2 and exon 3; 

2. Determine sense strand G nucleotide sequence for patient sample exon 2 and exon 3; 
then j 

3. Combine G and C sequencing resul: by computer analysis to identify 442 out of 630 
possible combinations, including 179/195 possible allelic pairs containing at least one Cw7 
or Cw4 allele (38.9% of Canadian population). 

4. Determine sense strand A nucleotide sequence for exons 2 and 3; 

5. Combine A, C and G sequencing results by computer analysis. Identifies remaining, 
undetermined heterozygoses. 

The only combinations that can not be distinguished after this point include 2 
remaining alleles, Cw* 12022 and Cw* 12021, which can not be distinguished by nucleotide 
sequencing of only exons*2 and 3. Further reactions according to the invention may be 
performed to distinguish among these alleles. Note that since these alleles differ only at a 
silent mutation, they are identical at the amino acid level, and do not need to be 
distinguished in practice. Sample reports can simply confirm the presence of the one allele ' 
plus either of Cw* 1 2022 or * 1 202 1 . 

If the patient sample: is identified at any one step, then the following step(s) need not 
be performed for that sample. 
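
The frequency based ordering used in this example can be sketched as a simple sort. The frequencies are those tabulated above; the dictionary name and helper function are assumptions of the sketch, and the "Unknown/other" row is left out because it does not name an allele group.

```python
# Allelic frequencies (%) among Canadians, as tabulated in this example.
CW_FREQUENCY = {
    "Cw1": 5.5, "Cw2": 4.4, "Cw4": 10.0, "Cw5": 6.4, "Cw6": 9.4,
    "Cw7": 28.9, "Cw9": 7.2, "Cw10": 5.7, "Cw11": 0.5,
}

def priority_order(freq):
    """Allele groups sorted from most to least frequent."""
    return sorted(freq, key=freq.get, reverse=True)

order = priority_order(CW_FREQUENCY)
print(order[:4])
# Combined frequency of the two groups targeted first.
print(round(CW_FREQUENCY["Cw7"] + CW_FREQUENCY["Cw4"], 1))
```

The first four entries reproduce the order stated in the text (Cw7, then Cw4, Cw6 and Cw9), and the combined Cw7 and Cw4 frequency matches the 38.9% quoted in step 3.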

EXAMPLE 9

Analysis of the HLA-DRB1 allelic type of a sample may be performed according to
Example 1 using two chain terminating nucleotides. 100 ng of patient sample DNA
(previously amplified as in Example 1) is combined with labeled sequencing primer:
5' - GAGTGTCATTTCTTCAA - 3' [SEQ ID NO. 18]

(30 ng (5 pM total)); in 2X sequencing buffer (52 mM Tris-HCl, pH 9.5, 13 mM MgCl2);
and 2 U of Thermo Sequenase enzyme (Amersham Life Sciences, Cleveland) in a final
volume of 3 ul. This sequencing pre-mix is kept on ice until ready to use, and then
combined with 3 ul of one of the following termination mixtures:

A/C termination reaction: 
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddATP; 2.5 uM ddCTP 

A/G termination reaction: 
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddGTP; 2.5 uM ddATP 

Total termination reaction volume: 6 ul 
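As a worked check of the dilution implied by these volumes, the short Python sketch below computes the final concentrations after 3 ul of pre-mix is combined with 3 ul of termination mixture. It assumes the pre-mix contributes no nucleotides of its own; the function name `final_conc` is introduced here for illustration.

```python
PREMIX_UL = 3.0            # sequencing pre-mix volume
TERMINATION_MIX_UL = 3.0   # termination mixture volume
TOTAL_UL = PREMIX_UL + TERMINATION_MIX_UL


def final_conc(stock_um, stock_ul=TERMINATION_MIX_UL, total_ul=TOTAL_UL):
    """Concentration (uM) after diluting `stock_ul` of stock into `total_ul`."""
    return stock_um * stock_ul / total_ul


dntp_um = final_conc(750.0)   # each dNTP in the 6 ul reaction
ddntp_um = final_conc(2.5)    # each ddNTP in the 6 ul reaction
ratio = dntp_um / ddntp_um    # dNTP:ddNTP ratio for each terminated base

if __name__ == "__main__":
    print(dntp_um, ddntp_um, ratio)
```

Under that assumption each dNTP ends up at 375 uM and each ddNTP at 1.25 uM, so the dNTP to ddNTP ratio for each terminated base stays at 300 to 1.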

The termination reaction mixture is thermal cycled in a Robocycler for 30 cycles (or fewer
if found to be satisfactory):
95 C 40 sec
50 C 30 sec
68 C 60 sec

After cycling, 12 ul of loading buffer consisting of 100% formamide with 5
mg/ml dextran blue is added to the termination reaction mixture, and an appropriate volume
(i.e. 1.5 ul) is loaded on to an automated DNA sequencing apparatus, such as a Visible
Genetics OPEN GENE™ System.
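For planning purposes, the hold time of the cycling program in this example can be totalled with a few lines of Python. This is only an arithmetic sketch: it counts the programmed hold times and ignores ramp or block-transfer time, which the text does not specify.

```python
CYCLES = 30
# Hold times per cycle, in seconds, as listed in the cycling program.
STEPS_SEC = {"95 C": 40, "50 C": 30, "68 C": 60}

per_cycle_sec = sum(STEPS_SEC.values())
total_sec = CYCLES * per_cycle_sec
total_min = total_sec / 60

if __name__ == "__main__":
    print(per_cycle_sec, total_sec, total_min)
```

Thirty cycles therefore correspond to 65 minutes of hold time; running fewer cycles, as the text permits, shortens this proportionally.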
Appendix 1

HLA Class I C locus: allele analysis on the basis of exons 2 and 3. 

Sequences obtained from the Strasbourg Data Base 

Internet Address: ftp://FTP.EMBL-Heidelberg.DE/pub/databases

35 known alleles for HLA Class I C locus.

1: Cw*0101.hla     18: Cw*0801.hla
2: Cw*0102.hla     19: Cw*0802.hla
3: Cw*0201.hla     20: Cw*0803.hla
4: Cw*02021.hla    21: Cw*1201.hla
5: Cw*02022.hla    22: Cw*12021.hla
6: Cw*0301.hla     23: Cw*12022.hla
7: Cw*0302.hla     24: Cw*1203.hla
8: Cw*0303.hla     25: Cw*1301.hla
9: Cw*0304.hla     26: Cw*1402.hla
10: Cw*0401.hla    27: Cw*1403.hla
11: Cw*0402.hla    28: Cw*1501.hla
12: Cw*0501.hla    29: Cw*1502.hla
13: Cw*0602.hla    30: Cw*1503.hla
14: Cw*0702.hla    31: Cw*1505.hla
15: Cw*0701.hla    32: Cw*1504.hla
16: Cw*0703.hla    33: Cw*1601.hla
17: Cw*0704.hla    34: Cw*1602.hla
                   35: Cw*1701.hla



35 alleles may be combined as 35 homozygous pairs or 595 heterozygous pairs, for 630 possible combinations in all.
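The pair counts follow from simple combinatorics, which the Python sketch below verifies. The variable names are introduced here for illustration; the only input taken from the text is the number of known alleles (35).

```python
from math import comb

N_ALLELES = 35

homozygous = N_ALLELES              # one pair per allele (X/X)
heterozygous = comb(N_ALLELES, 2)   # unordered pairs of two different alleles
total = homozygous + heterozygous   # all possible allelic combinations

if __name__ == "__main__":
    print(homozygous, heterozygous, total)
```

The total of 630 is the figure against which the "442 out of 630 possible combinations" in the termination-reaction ordering is counted.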

Homozygous pairs may be distinguished by single nucleotide sequencing in the 
following order: 
Non-Unique Sequences using A:

Cw*0102.hla = (Cw*0102.hla)
Cw*0101.hla = (Cw*0101.hla)
Cw*0701.hla = (Cw*0701.hla)
Cw*0702.hla = (Cw*0702.hla)
Cw*12022.hla = (Cw*12022.hla, Cw*1203.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*1203.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla)
Cw*1503.hla = (Cw*1503.hla)
Cw*1502.hla = (Cw*1502.hla)
Cw*1504.hla = (Cw*1504.hla)
Cw*1505.hla = (Cw*1505.hla)

Unique Sequences using A: 

1: Cw*0201.hla     13: Cw*0704.hla
2: Cw*02021.hla    14: Cw*0801.hla
3: Cw*02022.hla    15: Cw*0802.hla
4: Cw*0301.hla     16: Cw*0803.hla
5: Cw*0302.hla     17: Cw*1201.hla
6: Cw*0303.hla     18: Cw*1301.hla
7: Cw*0304.hla     19: Cw*1402.hla
8: Cw*0401.hla     20: Cw*1403.hla
9: Cw*0402.hla     21: Cw*1501.hla
10: Cw*0501.hla    22: Cw*1601.hla
11: Cw*0602.hla    23: Cw*1602.hla
12: Cw*0703.hla    24: Cw*1701.hla
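The split into "unique" and "non-unique" sequences for a given set of terminators can be sketched in Python as follows. This is an illustrative reconstruction, not the program that produced the appendix: the three alleles are hypothetical toy sequences, and `signature` and `classify` are names introduced here. An allele is unique for a set of bases if no other allele has the same positions for every base in the set.

```python
from collections import defaultdict

# Hypothetical toy alleles (NOT real HLA-C sequences).
ALLELES = {
    "p": "ACGT",
    "q": "ACTG",
    "r": "CAGT",
}


def signature(seq, bases):
    """Positions of each terminating base, in the order given by `bases`."""
    return tuple(
        tuple(i for i, b in enumerate(seq) if b == base) for base in bases
    )


def classify(alleles, bases):
    """Return (unique allele names, {non-unique allele: alleles it matches})."""
    groups = defaultdict(list)
    for name, seq in alleles.items():
        groups[signature(seq, bases)].append(name)
    unique = sorted(g[0] for g in groups.values() if len(g) == 1)
    non_unique = {
        n: [m for m in g if m != n]
        for g in groups.values()
        if len(g) > 1
        for n in g
    }
    return unique, non_unique


if __name__ == "__main__":
    for bases in ("A", "G", "AG"):
        print(bases, classify(ALLELES, bases))
```

With this toy data neither A alone nor G alone separates all three alleles, but the AG combination does, mirroring how the two- and three-base lists in the appendix have fewer non-unique entries than the single-base lists.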



Non-Unique Sequences using C:

Cw*02022.hla = (Cw*02022.hla)
Cw*02021.hla = (Cw*02021.hla)
Cw*0304.hla = (Cw*0304.hla)
Cw*0303.hla = (Cw*0303.hla)
Cw*0802.hla = (Cw*0802.hla)
Cw*0803.hla = (Cw*0803.hla)
Cw*0501.hla = (Cw*0501.hla)
Cw*0801.hla = (Cw*0801.hla)
Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1504.hla = (Cw*1504.hla)
Cw*1403.hla = (Cw*1403.hla)
Cw*1402.hla = (Cw*1402.hla)
Cw*1503.hla = (Cw*1503.hla, Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla, Cw*1503.hla)
Cw*1203.hla = (Cw*1203.hla)
Cw*1602.hla = (Cw*1602.hla)
Cw*1601.hla = (Cw*1601.hla)
Unique Sequences using C:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*0301.hla
5: Cw*0302.hla
6: Cw*0401.hla
7: Cw*0402.hla
8: Cw*0602.hla
9: Cw*0702.hla
10: Cw*0701.hla
11: Cw*0703.hla
12: Cw*0704.hla
13: Cw*1201.hla
14: Cw*1301.hla
15: Cw*1501.hla
16: Cw*1701.hla



r«»n?n22.hla = {r w «n2n22.hla) 

hl» =V ~«"™7 hi*- fa'"^ rw*M01 hla, Q«0803,Ma. 

^^^ y^l .rv.amyi rv-n i N„ Cw«08Q3,„,a, 

r-l ^n, h" - /C n «™» ft rw«030Thla Cw«03"4.h|a Cw«0803,h| a , - 

fcsSSt^i^^7M; 7^17^ rw*r 03 M,, rw» 30 A a 
^■ndii 1.1, - rr ^" f» hln f> N707? hla, Cw 1301 ,fc a 

^^iZ^iS^ ^^ - '™M7 hla. O«0303,hla, 
r w «n304 hla. Q «M»" M» Cw«08<n hla, CwM602,hla) 
r w »n3n2.hi a = f^nam.hi.. C w « 03Q3,hla, Cw*0?04 Ma. TWOSOLhla, 
rw-oaM.hla, rw» 1601 hla) 



Unique Sequences using G:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*0301.hla
5: Cw*0402.hla
6: Cw*0501.hla
7: Cw*0602.hla
8: Cw*0702.hla
9: Cw*0701.hla
10: Cw*0703.hla
11: Cw*0704.hla
12: Cw*1201.hla
13: Cw*1503.hla
14: Cw*1701.hla



Non-Unique Sequences using T:

Cw*0102.hla = (Cw*0102.hla)
Cw*0101.hla = (Cw*0101.hla)
Cw*02021.hla = (Cw*02021.hla, Cw*02022.hla)
Cw*0201.hla = (Cw*0201.hla, Cw*02022.hla)
Cw*0201.hla = (Cw*0201.hla, Cw*02021.hla)
Cw*0303.hla = (Cw*0303.hla, Cw*0304.hla)
Cw*0302.hla = (Cw*0302.hla, Cw*0304.hla)
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla)
Cw*0402.hla = (Cw*0402.hla)
Cw*0401.hla = (Cw*0401.hla)
Cw*0801.hla = (Cw*0801.hla, Cw*0802.hla, Cw*0803.hla)
Cw*0701.hla = (Cw*0701.hla)
Cw*0702.hla = (Cw*0702.hla)
Cw*0501.hla = (Cw*0501.hla, Cw*0802.hla, Cw*0803.hla)
Cw*0501.hla = (Cw*0501.hla, Cw*0801.hla, Cw*0803.hla)
Cw*0501.hla = (Cw*0501.hla, Cw*0801.hla, Cw*0802.hla)
Cw*12022.hla = (Cw*12022.hla, Cw*1301.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*1301.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla)
Cw*1403.hla = (Cw*1403.hla)
Cw*1402.hla = (Cw*1402.hla)
Cw*1503.hla = (Cw*1503.hla, Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla, Cw*1503.hla)
Cw*1602.hla = (Cw*1602.hla)
Cw*1601.hla = (Cw*1601.hla)

Unique Sequences using T:

1: Cw*0301.hla    4: Cw*0704.hla    7: Cw*1501.hla
2: Cw*0602.hla    5: Cw*1201.hla    8: Cw*1504.hla
3: Cw*0703.hla    6: Cw*1203.hla    9: Cw*1701.hla



Non-Unique Sequences using AC:

Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1503.hla = (Cw*1503.hla)
Cw*1502.hla = (Cw*1502.hla)

Unique Sequences using AC:

1: Cw*0101.hla     12: Cw*0501.hla    23: Cw*1301.hla
2: Cw*0102.hla     13: Cw*0602.hla    24: Cw*1402.hla
3: Cw*0201.hla     14: Cw*0702.hla    25: Cw*1403.hla
4: Cw*02021.hla    15: Cw*0701.hla    26: Cw*1501.hla
5: Cw*02022.hla    16: Cw*0703.hla    27: Cw*1505.hla
6: Cw*0301.hla     17: Cw*0704.hla    28: Cw*1504.hla
7: Cw*0302.hla     18: Cw*0801.hla    29: Cw*1601.hla
8: Cw*0303.hla     19: Cw*0802.hla    30: Cw*1602.hla
9: Cw*0304.hla     20: Cw*0803.hla    31: Cw*1701.hla
10: Cw*0401.hla    21: Cw*1201.hla
11: Cw*0402.hla    22: Cw*1203.hla


Non-Unique Sequences using AG:

Cw*12022.hla = (Cw*12022.hla, Cw*1203.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*1203.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla)
Cw*1504.hla = (Cw*1504.hla)
Cw*1505.hla = (Cw*1505.hla)

Unique Sequences using AG:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*02021.hla
5: Cw*02022.hla
6: Cw*0301.hla
7: Cw*0302.hla
8: Cw*0303.hla
9: Cw*0304.hla
10: Cw*0401.hla
11: Cw*0402.hla
12: Cw*0501.hla
13: Cw*0602.hla
14: Cw*0702.hla
15: Cw*0701.hla
16: Cw*0703.hla
17: Cw*0704.hla
18: Cw*0801.hla
19: Cw*0802.hla
20: Cw*0803.hla
21: Cw*1201.hla
22: Cw*1301.hla
23: Cw*1402.hla
24: Cw*1403.hla
25: Cw*1501.hla
26: Cw*1502.hla
27: Cw*1503.hla
28: Cw*1601.hla
29: Cw*1602.hla
30: Cw*1701.hla
Non-Unique Sequences using AT:

Cw*0102.hla = (Cw*0102.hla)
Cw*0101.hla = (Cw*0101.hla)
Cw*0701.hla = (Cw*0701.hla)
Cw*0702.hla = (Cw*0702.hla)
Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1503.hla = (Cw*1503.hla)
Cw*1502.hla = (Cw*1502.hla)

Unique Sequences using AT:

1: Cw*0201.hla
2: Cw*02021.hla
3: Cw*02022.hla
4: Cw*0301.hla
5: Cw*0302.hla
6: Cw*0303.hla
7: Cw*0304.hla
8: Cw*0401.hla
9: Cw*0402.hla
10: Cw*0501.hla
11: Cw*0602.hla
12: Cw*0703.hla
13: Cw*0704.hla
14: Cw*0801.hla
15: Cw*0802.hla
16: Cw*0803.hla
17: Cw*1201.hla
18: Cw*1203.hla
19: Cw*1301.hla
20: Cw*1402.hla
21: Cw*1403.hla
22: Cw*1501.hla
23: Cw*1505.hla
24: Cw*1504.hla
25: Cw*1601.hla
26: Cw*1602.hla
27: Cw*1701.hla



Non-Unique Sequences using CG:

Cw*02022.hla = (Cw*02022.hla)
Cw*02021.hla = (Cw*02021.hla)
Cw*0304.hla = (Cw*0304.hla)
Cw*0303.hla = (Cw*0303.hla)
Cw*0803.hla = (Cw*0803.hla)
Cw*0801.hla = (Cw*0801.hla)
Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1403.hla = (Cw*1403.hla)
Cw*1402.hla = (Cw*1402.hla)
Cw*1505.hla = (Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla)
Cw*1602.hla = (Cw*1602.hla)
Cw*1601.hla = (Cw*1601.hla)

Unique Sequences using CG:

1: Cw*0101.hla    8: Cw*0501.hla     15: Cw*1201.hla
2: Cw*0102.hla    9: Cw*0602.hla     16: Cw*1203.hla
3: Cw*0201.hla    10: Cw*0702.hla    17: Cw*1301.hla
4: Cw*0301.hla    11: Cw*0701.hla    18: Cw*1501.hla
5: Cw*0302.hla    12: Cw*0703.hla    19: Cw*1503.hla
6: Cw*0401.hla    13: Cw*0704.hla    20: Cw*1504.hla
7: Cw*0402.hla    14: Cw*0802.hla    21: Cw*1701.hla



Non-Unique Sequences using CT:

Cw*02022.hla = (Cw*02022.hla)
Cw*02021.hla = (Cw*02021.hla)
Cw*0304.hla = (Cw*0304.hla)
Cw*0303.hla = (Cw*0303.hla)
Cw*0802.hla = (Cw*0802.hla)
Cw*0803.hla = (Cw*0803.hla)
Cw*0501.hla = (Cw*0501.hla)
Cw*0801.hla = (Cw*0801.hla)
Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1403.hla = (Cw*1403.hla)
Cw*1402.hla = (Cw*1402.hla)
Cw*1503.hla = (Cw*1503.hla, Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla, Cw*1503.hla)
Cw*1602.hla = (Cw*1602.hla)
Cw*1601.hla = (Cw*1601.hla)

Unique Sequences using CT:

1: Cw*0101.hla    7: Cw*0402.hla     13: Cw*1201.hla
2: Cw*0102.hla    8: Cw*0602.hla     14: Cw*1203.hla
3: Cw*0201.hla    9: Cw*0702.hla     15: Cw*1301.hla
4: Cw*0301.hla    10: Cw*0701.hla    16: Cw*1501.hla
5: Cw*0302.hla    11: Cw*0703.hla    17: Cw*1504.hla
6: Cw*0401.hla    12: Cw*0704.hla    18: Cw*1701.hla



Non-Unique Sequences using GT:

Cw*02022.hla = (Cw*02022.hla)
Cw*02021.hla = (Cw*02021.hla)
Cw*0303.hla = (Cw*0303.hla, Cw*0304.hla)
Cw*0302.hla = (Cw*0302.hla, Cw*0304.hla)
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla)
Cw*0803.hla = (Cw*0803.hla)
Cw*0801.hla = (Cw*0801.hla)
Cw*12022.hla = (Cw*12022.hla, Cw*1301.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*1301.hla)
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla)
Cw*1403.hla = (Cw*1403.hla)
Cw*1402.hla = (Cw*1402.hla)
Cw*1505.hla = (Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla)
Cw*1602.hla = (Cw*1602.hla)
Cw*1601.hla = (Cw*1601.hla)

Unique Sequences using GT:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*0301.hla
5: Cw*0401.hla
6: Cw*0402.hla
7: Cw*0501.hla
8: Cw*0602.hla
9: Cw*0702.hla
10: Cw*0701.hla
11: Cw*0703.hla
12: Cw*0704.hla
13: Cw*0802.hla
14: Cw*1201.hla
15: Cw*1203.hla
16: Cw*1501.hla
17: Cw*1503.hla
18: Cw*1504.hla
19: Cw*1701.hla



Non-Unique Sequences using ACG:

Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)

Unique Sequences using ACG:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*02021.hla
5: Cw*02022.hla
6: Cw*0301.hla
7: Cw*0302.hla
8: Cw*0303.hla
9: Cw*0304.hla
10: Cw*0401.hla
11: Cw*0402.hla
12: Cw*0501.hla
13: Cw*0602.hla
14: Cw*0702.hla
15: Cw*0701.hla
16: Cw*0703.hla
17: Cw*0704.hla
18: Cw*0801.hla
19: Cw*0802.hla
20: Cw*0803.hla
21: Cw*1201.hla
22: Cw*1203.hla
23: Cw*1301.hla
24: Cw*1402.hla
25: Cw*1403.hla
26: Cw*1501.hla
27: Cw*1502.hla
28: Cw*1503.hla
29: Cw*1505.hla
30: Cw*1504.hla
31: Cw*1601.hla
32: Cw*1602.hla
33: Cw*1701.hla



Non-Unique Sequences using ACT:

Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1503.hla = (Cw*1503.hla)
Cw*1502.hla = (Cw*1502.hla)

Unique Sequences using ACT:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*02021.hla
5: Cw*02022.hla
6: Cw*0301.hla
7: Cw*0302.hla
8: Cw*0303.hla
9: Cw*0304.hla
10: Cw*0401.hla
11: Cw*0402.hla
12: Cw*0501.hla
13: Cw*0602.hla
14: Cw*0702.hla
15: Cw*0701.hla
16: Cw*0703.hla
17: Cw*0704.hla
18: Cw*0801.hla
19: Cw*0802.hla
20: Cw*0803.hla
21: Cw*1201.hla
22: Cw*1203.hla
23: Cw*1301.hla
24: Cw*1402.hla
25: Cw*1403.hla
26: Cw*1501.hla
27: Cw*1505.hla
28: Cw*1504.hla
29: Cw*1601.hla
30: Cw*1602.hla
31: Cw*1701.hla



Non-Unique Sequences using AGT:

Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)

Unique Sequences using AGT:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*02021.hla
5: Cw*02022.hla
6: Cw*0301.hla
7: Cw*0302.hla
8: Cw*0303.hla
9: Cw*0304.hla
10: Cw*0401.hla
11: Cw*0402.hla
12: Cw*0501.hla
13: Cw*0602.hla
14: Cw*0702.hla
15: Cw*0701.hla
16: Cw*0703.hla
17: Cw*0704.hla
18: Cw*0801.hla
19: Cw*0802.hla
20: Cw*0803.hla
21: Cw*1201.hla
22: Cw*1203.hla
23: Cw*1301.hla
24: Cw*1402.hla
25: Cw*1403.hla
26: Cw*1501.hla
27: Cw*1502.hla
28: Cw*1503.hla
29: Cw*1505.hla
30: Cw*1504.hla
31: Cw*1601.hla
32: Cw*1602.hla
33: Cw*1701.hla



Non-Unique Sequences using CGT:

Cw*02022.hla = (Cw*02022.hla)
Cw*02021.hla = (Cw*02021.hla)
Cw*0304.hla = (Cw*0304.hla)
Cw*0303.hla = (Cw*0303.hla)
Cw*0803.hla = (Cw*0803.hla)
Cw*0801.hla = (Cw*0801.hla)
Cw*12022.hla = (Cw*12022.hla)
Cw*12021.hla = (Cw*12021.hla)
Cw*1403.hla = (Cw*1403.hla)
Cw*1402.hla = (Cw*1402.hla)
Cw*1505.hla = (Cw*1505.hla)
Cw*1502.hla = (Cw*1502.hla)
Cw*1602.hla = (Cw*1602.hla)
Cw*1601.hla = (Cw*1601.hla)

Unique Sequences using CGT:

1: Cw*0101.hla    8: Cw*0501.hla     15: Cw*1201.hla
2: Cw*0102.hla    9: Cw*0602.hla     16: Cw*1203.hla
3: Cw*0201.hla    10: Cw*0702.hla    17: Cw*1301.hla
4: Cw*0301.hla    11: Cw*0701.hla    18: Cw*1501.hla
5: Cw*0302.hla    12: Cw*0703.hla    19: Cw*1503.hla
6: Cw*0401.hla    13: Cw*0704.hla    20: Cw*1504.hla
7: Cw*0402.hla    14: Cw*0802.hla    21: Cw*1701.hla
Non-Unique Sequences using ACGT:

Cw*12022.hla = (Cw*12021.hla)
Cw*12021.hla = (Cw*12022.hla)

Unique Sequences using ACGT:

1: Cw*0101.hla
2: Cw*0102.hla
3: Cw*0201.hla
4: Cw*02021.hla
5: Cw*02022.hla
6: Cw*0301.hla
7: Cw*0302.hla
8: Cw*0303.hla
9: Cw*0304.hla
10: Cw*0401.hla
11: Cw*0402.hla
12: Cw*0501.hla
13: Cw*0602.hla
14: Cw*0702.hla
15: Cw*0701.hla
16: Cw*0703.hla
17: Cw*0704.hla
18: Cw*0801.hla
19: Cw*0802.hla
20: Cw*0803.hla
21: Cw*1201.hla
22: Cw*1203.hla
23: Cw*1301.hla
24: Cw*1402.hla
25: Cw*1403.hla
26: Cw*1501.hla
27: Cw*1502.hla
28: Cw*1503.hla
29: Cw*1505.hla
30: Cw*1504.hla
31: Cw*1601.hla
32: Cw*1602.hla
33: Cw*1701.hla

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Stevens, John K.
Dunn, James M.
Leushner, James
Green, Ronald

(ii) TITLE OF INVENTION: Method for Evaluation of
Polymorphic Genetic Sequences, and Use Thereof in
Identification of HLA Types

(iii) NUMBER OF SEQUENCES: 33

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Oppedahl & Larson

(B) STREET: 1992 Commerce Street Suite 309

(C) CITY: Yorktown

(D) STATE: NY

(E) COUNTRY: US

(F) ZIP: 10598

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette - 3.5 inch, 1.44 Mb storage

(B) COMPUTER: IBM compatible

(C) OPERATING SYSTEM: MS DOS

(D) SOFTWARE: Word Perfect

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Larson, Marina T.

(B) REGISTRATION NUMBER: 32,038

(C) REFERENCE/DOCKET NUMBER: VGEN.P-019-WO

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (914) 245-3252

(B) TELEFAX: (914) 962-4330

(C) TELEX:

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR1 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TTGTGGCAGC TTAAGTTTGA AT 22

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR1 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CCGCCTCTGC TCCAGGAG 18 
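Each sequence line in this listing ends with the declared length of the sequence, which makes transcription errors easy to detect. The Python sketch below, using a helper name (`check_sequence_line`) introduced here for illustration, joins the ten-base groups and compares the result with the trailing number. It assumes the line contains only base groups followed by a single integer.

```python
def check_sequence_line(line):
    """Split a listing line into (sequence, declared length, lengths agree)."""
    *groups, declared = line.split()
    seq = "".join(groups)
    return seq, int(declared), len(seq) == int(declared)


if __name__ == "__main__":
    # SEQ ID NO: 1 and SEQ ID NO: 2 as given in the listing.
    print(check_sequence_line("TTGTGGCAGC TTAAGTTTGA AT 22"))
    print(check_sequence_line("CCGCCTCTGC TCCAGGAG 18"))
```

The same check can be run over every entry to flag lines where characters were lost or misread.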

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR1
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CCCGCTCGTC TTCCAGGAT 19 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE : yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR2 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
TCCTGTGGCA GCCTAAGAG 19

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR2
alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CCGCGCCTGC TCCAGGAT 18

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR2
alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
AGGTGTCCAC CGCGCGGCG 19



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR3, 8,
11, 12, 13, 14 alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
CACGTTTCTT GGAGTACTCT AC 22 

(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR3, 8,
11, 12, 13, 14 alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CCGCTGCACT GTGAAGCTCT 20 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR4
alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GTTTCTTGGA GCAGGTTAAA CA 22 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR4 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CTGCACTGTG AAGCTCTCAC 20 



(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR4 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CTGCACTGTG AAGCTCTCCA 20 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR7 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCTGTGGCAG GGTAAGTATA 20 

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR7 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CCCGTAGTTG TGTCTGCACA C 21



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR9
alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GTTTCTTGAA GCAGGATAAG TTT 23



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR9
alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CCCGTAGTTG TGTCTGCACA C 21

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR10
alleles of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CGGTTGCTGG AAAGACGCG 19 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for DR10 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CTGCACTGTG AAGCTCTCAC 20 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: sequencing primer for DR alleles
of HLA Class II genes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GAGTGTCATT TCTTCAA 17 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for HLA-C
gene, exon 2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
AGCGAGTGCC CGCCCGGCGA 20 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for HLA-C gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
ACCTGGCCCG TCCGTGGGGG ATGAG 25

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for HLA-C gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GACCGCGGGG CCGGGGCCAG GG 22

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for HLA-C gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GGAGATGGGG AAGGCTCCCC ACT 23

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: forward sequencing primer for
HLA-C gene, exon 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CCGGGGCGCA GGTCACGA 18 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: forward sequencing primer for 
HLA-C gene, exon 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GGAGGGTCGG GCGGGTCT 18 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: reverse sequencing primer for 
HLA-C gene, exon 3 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
CGGGACGTCG CAGAGGAA 18 

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human

(D) OTHER INFORMATION: amplification primer for exon 6
of lipoprotein lipase gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GCCGAGATAC AATCTTGGTG 20



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE : 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for exon 6
of lipoprotein lipase gene 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:27: 
CAGGTACATT TTGCTGCTTC 20 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Chlamydia

(D) OTHER INFORMATION: amplification primer for
Chlamydia ompl gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
ACCACTTGGT GTGACGCTAT CAG 23 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL : no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia

(D) OTHER INFORMATION: amplification primer for
Chlamydia ompl gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CGGAATTGTG CATTTACGTG AG 22 




(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia ompl gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CCGACCGCPT CTTGAAAACA GATGT 25 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for
Chlamydia omp1 gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CACCCACATT CCCAGAGAGC T 21 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CGTGCAGCTT TGTGGGAATG T 21



(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 

(D) OTHER INFORMATION: [illegible] primer for [illegible]

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CTAGATTTCA TCTTGTTCRA TTGC 24



CLAIMS

1. A method for identification of allelic type of a known polymorphic
genetic locus in a sample comprising the steps of:

(a) combining the sample with a sequencing reaction mixture containing
a template-dependent nucleic acid polymerase, A, T, G and C nucleotide feedstocks,
one type of chain terminating nucleotide and a sequencing primer under conditions
suitable for template dependent primer extension to form a plurality of oligonucleotide
fragments of differing lengths, the lengths of said fragments indicating the positions of
the type of base corresponding to the chain terminating nucleotide in the extended
primer; and

(b) evaluating the lengths of the oligonucleotide fragments thereby
determining the positions of the type of base corresponding to the chain
terminating nucleotide in the extended primer, characterized in that the sample is
concurrently combined with at most three sequencing reaction mixtures containing
different types of chain terminating nucleotides.
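The relationship recited in claim 1 between fragment length and base position can be illustrated with a minimal Python sketch. It assumes an idealized reaction in which every occurrence of the terminated base past the primer yields exactly one fragment; the function names, the primer and the extended sequence are hypothetical and are not taken from the specification.

```python
from typing import List


def fragment_lengths(extended_primer: str, primer_length: int, terminator: str) -> List[int]:
    """Lengths of the fragments produced by one chain terminating nucleotide.

    `extended_primer` is the primer followed by the newly synthesized bases.
    A fragment ends wherever the synthesized strand contains `terminator`,
    so its length is the 1-based index of that base.
    """
    lengths = []
    for index, base in enumerate(extended_primer):
        # Bases inside the primer are not synthesized and cannot terminate.
        if index >= primer_length and base == terminator:
            lengths.append(index + 1)
    return lengths


def positions_from_lengths(lengths: List[int], primer_length: int) -> List[int]:
    """Convert fragment lengths back to base positions counted from the primer end."""
    return [length - primer_length for length in lengths]


# Hypothetical 7-base primer followed by a 9-base extension.
EXTENDED = "GATTACA" + "TCAGAACTA"
A_LENGTHS = fragment_lengths(EXTENDED, 7, "A")
A_POSITIONS = positions_from_lengths(A_LENGTHS, 7)
```

In this idealized model a single lane recovers only the positions of one base type, which is the information that step (b) evaluates.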

2. The method of claim 1, wherein the sample is combined with a
single sequencing reaction mixture containing at most two chain terminating 
nucleotides, and the lengths of the oligonucleotide fragments produced are evaluated 
prior to combining the sample with any further sequencing reaction mixture. 

3. The method of claim 1, wherein the sample is combined with a
single sequencing reaction mixture containing only one chain terminating nucleotide, 
and the lengths of the oligonucleotide fragments produced are evaluated prior to 
combining the sample with any further sequencing reaction mixture. 

4. The method of any of claims 1 to 3, wherein the sample is amplified
prior to combining it with the sequencing reaction mixture to enrich the amount of the
polymorphic genetic locus.

5. The method of claim 4, wherein the amplification is performed using
polymerase chain reaction amplification.

6. The method of any of claims 1 to 5, characterized in that the length
of the oligonucleotide fragments is evaluated by electrophoretic separation on a 
denaturing gel. 

7. A kit for identification of allelic type of a polymorphic genetic locus 
in a sample comprising, in packaged combination, 

(a) a sequencing primer adapted to hybridize to genetic material in the
sample near the polymorphic genetic locus; and

(b) two or more chain terminating nucleotides, wherein a first of said
chain terminating nucleotides is provided in an amount which is five or more times
greater than the amount of any other chain terminating nucleotide.

8. The kit of claim 7, wherein the first chain terminating nucleotide is
dideoxyadenosine.

9. The kit of claim 7, wherein the first chain terminating nucleotide is
dideoxycytosine.

10. The kit of claim 7, wherein the first chain terminating nucleotide is
dideoxythymine.

11. The kit of claim 7, wherein the first chain terminating nucleotide is
dideoxyguanosine.

12. A method for determining the allelic type of a polymorphic gene in a 
sample comprising the steps of: 

(a) combining a first aliquot of the sample with a first sequencing 
reaction mixture containing a template-dependent nucleic acid polymerase, A, T, G and 
C nucleotide feedstocks, a first type of chain terminating nucleotide and a sequencing 
primer under conditions suitable for template dependent primer extension to form a
plurality of oligonucleotide fragments of differing lengths, the lengths of said fragments 
indicating the positions of the type of base corresponding to the first type of chain 
terminating nucleotide in the extended primer; 

(b) evaluating the length of the oligonucleotide fragments to determine
the positions of the type of base corresponding to the first type of chain terminating
nucleotide in the extended primer; and

(c) comparing the positions of the type of base corresponding to the
first type of chain terminating nucleotide in the extended primer to the positions found
in known alleles of the gene whereby the sample can either be assigned as being of a 
particular type or is assigned as ambiguous for further evaluation. 
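The comparison of step (c) can be sketched as a set lookup. The following Python example is a simplified illustration that assumes a homozygous sample and error-free position data; the allele names and sequences are hypothetical, and `assign_allele` is an illustrative helper rather than a procedure defined in the specification.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple


def base_positions(sequence: str, base: str) -> FrozenSet[int]:
    """1-based positions at which `base` occurs in `sequence`."""
    return frozenset(i for i, b in enumerate(sequence, start=1) if b == base)


def assign_allele(
    observed: FrozenSet[int],
    known_alleles: Dict[str, str],
    base: str,
) -> Tuple[Optional[str], List[str]]:
    """Compare observed positions of one base type with each known allele.

    Returns (assigned allele or None, list of alleles still consistent).
    The sample is assigned only when exactly one allele matches; otherwise
    it is treated as ambiguous and left for further evaluation.
    """
    matches = [
        name
        for name, sequence in known_alleles.items()
        if base_positions(sequence, base) == observed
    ]
    assigned = matches[0] if len(matches) == 1 else None
    return assigned, matches


# Hypothetical alleles of a 6-base polymorphic region.
KNOWN_ALLELES = {
    "ALLELE_1": "ACGATA",
    "ALLELE_2": "ACGTTA",
    "ALLELE_3": "AGCATA",
}

unique = assign_allele(frozenset({1, 6}), KNOWN_ALLELES, "A")
ambiguous = assign_allele(frozenset({1, 4, 6}), KNOWN_ALLELES, "A")
```

Here the A positions {1, 6} match only the second hypothetical allele, while {1, 4, 6} match two alleles, so the second sample would proceed to a further termination reaction as in claim 13.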

13. The method of claim 12, wherein the sample is ambiguous after
comparing the positions of the type of base corresponding to the first type of chain
terminating nucleotide in the extended primer to the positions found in known alleles
of the gene, further comprising the steps of

combining a second aliquot of the sample with a second sequencing
reaction mixture containing a template-dependent nucleic acid polymerase, A, T, G and
C nucleotide feedstocks, a second type of chain terminating nucleotide, different from
said first type, and a sequencing primer under conditions suitable for template
dependent primer extension to form a plurality of oligonucleotide fragments of
differing lengths, the lengths of said fragments indicating the positions of the type of
base corresponding to the second type of chain terminating nucleotide in the extended
primer;

evaluating the length of the oligonucleotide fragments to determine the
positions of the type of base corresponding to the second type of chain terminating
nucleotide in the extended primer; and

comparing the positions of the type of base corresponding to the first and
second types of chain terminating nucleotide in the extended primer to the positions
found in known alleles of the gene whereby the sample can either be assigned as being
of a particular type or is assigned as ambiguous for further evaluation.

14. The method of claim 13, wherein the sample is ambiguous after
comparing the positions of the type of base corresponding to the first and second types
of chain terminating nucleotide in the extended primer to the positions found in known
alleles of the gene, further comprising the steps of

combining a third aliquot of the sample with a third sequencing reaction
mixture containing a template-dependent nucleic acid polymerase, A, T, G and C
nucleotide feedstocks, a third type of chain terminating nucleotide, different from said
first and second types, and a sequencing primer under conditions suitable for template
dependent primer extension to form a plurality of oligonucleotide fragments of
differing lengths, the lengths of said fragments indicating the positions of the type of
base corresponding to the third type of chain terminating nucleotide in the extended
primer;

evaluating the length of the oligonucleotide fragments to determine the
positions of the type of base corresponding to the third type of chain terminating
nucleotide in the extended primer; and

comparing the positions of the type of base corresponding to the first,
second and third types of chain terminating nucleotide in the extended primer to the
positions found in known alleles of the gene whereby the sample can either be assigned
as being of a particular type or is assigned as ambiguous for further evaluation.

15. The method of claim 14, wherein the sample is ambiguous after
comparing the positions of the type of base corresponding to the first, second and third
types of chain terminating nucleotide in the extended primer to the positions found in
known alleles of the gene, further comprising the steps of

combining a fourth aliquot of the sample with a fourth sequencing reaction
mixture containing a template-dependent nucleic acid polymerase, A, T, G and C
nucleotide feedstocks, a fourth type of chain terminating nucleotide, different from said
first, second and third types, and a sequencing primer under conditions suitable for
template dependent primer extension to form a plurality of oligonucleotide fragments
of differing lengths, the lengths of said fragments indicating the positions of the type of
base corresponding to the fourth type of chain terminating nucleotide in the extended
primer;

evaluating the length of the oligonucleotide fragments to determine the
positions of the type of base corresponding to the fourth type of chain terminating
nucleotide in the extended primer; and

comparing the positions of the type of base corresponding to the first,
second, third and fourth types of chain terminating nucleotide in the extended primer to
the positions found in known alleles of the gene whereby the sample can either be
assigned as being of a particular type or is assigned as ambiguous for further
evaluation.
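The stepwise procedure of claims 12 to 15 can be sketched as a loop that adds one termination reaction at a time and stops as soon as at most one known allele remains consistent with the data. This Python example is a simplified illustration under stated assumptions: the sample is homozygous, the data are error-free, each reaction is simulated by reading positions from a hypothetical sample sequence rather than from a gel, and the order A, C, G, T is an arbitrary choice that the claims do not prescribe.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple


def positions(sequence: str, base: str) -> FrozenSet[int]:
    """1-based positions at which `base` occurs in `sequence`."""
    return frozenset(i for i, b in enumerate(sequence, start=1) if b == base)


def type_sample(
    sample_sequence: str,
    known_alleles: Dict[str, str],
    terminator_order: Sequence[str] = ("A", "C", "G", "T"),
) -> Tuple[List[str], int]:
    """Run termination reactions one at a time until the type is resolved.

    Returns (remaining candidate alleles, number of reactions performed).
    One candidate means the sample is assigned; several mean it is still
    ambiguous after all reactions; none means no known allele matches.
    """
    candidates = set(known_alleles)
    reactions = 0
    for base in terminator_order:
        reactions += 1
        # Stand-in for running one reaction and reading the fragment ladder.
        observed = positions(sample_sequence, base)
        candidates = {
            name for name in candidates
            if positions(known_alleles[name], base) == observed
        }
        if len(candidates) <= 1:
            break  # resolved (or unmatched): no further reaction needed
    return sorted(candidates), reactions


# Hypothetical alleles; the third differs from the first only in C and G positions.
KNOWN_ALLELES = {
    "ALLELE_1": "ACGATA",
    "ALLELE_2": "ACGTTA",
    "ALLELE_3": "AGCATA",
}

resolved_in_one = type_sample("ACGTTA", KNOWN_ALLELES)
resolved_in_two = type_sample("AGCATA", KNOWN_ALLELES)
```

With these hypothetical alleles, one sample is typed after a single reaction and the other after two, which illustrates why fewer than four reactions may suffice for many samples.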

16. The method of any of claims 12 to 15, wherein the sample is
amplified prior to combining it with the sequencing reaction mixture to enrich the
amount of the polymorphic genetic locus.

17. The method of claim 16, wherein the amplification is performed
using polymerase chain reaction amplification.

18. The method of any of claims 12 to 17, wherein the gene is an HLA
Class I gene.

19. The method of any of claims 12 to 17, wherein the gene is an HLA
Class II gene.



FIG. 1 (sheet 1/7): GENE XY21, nucleotide (nt) positions 100, 101 and 102. [Graphic not reproduced.]






FIG. 3 (sheet 1/7): labels: OBSERVED DATA (4 nts); ALLELE 1 to ALLELE 3; ALLELE 12 to ALLELE 15. Caption: IMPOSSIBLE TO DISTINGUISH HETEROZYGOTE PAIR BY DNA SEQUENCING ALONE. [Graphic not reproduced.]



FIG. 4 (sheet 1/7): labels: CONTROL LANE (4 nt); A LANE; SEQUENCE (SAMPLE); TEXT FILE FOR RESULTS; AUTORADIOGRAPH RESULTS. [Graphic not reproduced.]

FIG. 2A (sheet 2/7): labels: OBSERVED DATA; ALLELE 1; ALLELE 2. [Graphic not reproduced.]

FIG. 2B (sheet 2/7): labels: OBSERVED DATA; ALLELE 3 to ALLELE 7; ELIMINATED; POSSIBLE. [Graphic not reproduced.]

FIG. 2C (sheet 2/7): labels: OBSERVED DATA, ONE nt ONLY, with ALLELE 8 to ALLELE 11 (UNCLEAR WHICH ALLELIC PAIR IS CORRECT); OBSERVED DATA, TWO nt ONLY, with ALLELE 8 to ALLELE 11 (INCORRECT; CORRECT). [Graphic not reproduced.]

FIG. 2D (sheet 2/7): labels: OBSERVED DATA; POSSIBILITIES: ALLELE A, ALLELE B, ALLELE C; HOMOZYGOTE; HETEROZYGOTE. [Graphic not reproduced.]




FIG. 6 (sheet 4/7): labels: PEAK MAXIMUM; PATIENT SAMPLE; KNOWN ALLELE. Caption: ABSENCE OF PEAK IDENTIFIED BY COMPARISON OF AREAS UNDER THE CURVE FOR EACH PEAK. [Graphic not reproduced.]



FIG. 7 (sheet 4/7): vertical axis: NUMBER OF PEAKS (1 to 8); horizontal axis: peak separation [data points] (10 to 60); annotation: POSSIBLE RANGE OF SINGLE NUCLEOTIDE SEPARATIONS. [Graphic not reproduced.]



FIG. 8A (sheet 5/7): C. trachomatis omp1 (VD1) genotype identification. Possible T Termination Reaction results, nucleotide positions 256 to 280. Rows: Group 1 (B, Ba, D, E, L1, L2); Group 2 (F, G); Group 3 (C, A, I, J, K, L3); Observed Results (Sample 1, Sample 2). [Per-position patterns not recoverable from the scanned sheet.]



Sheet 6/7: [Graphic not reproduced.]

FIG. 8C (sheet 7/7): panels: POSSIBLE C TERMINATION REACTION RESULTS; POSSIBLE G TERMINATION REACTION RESULTS, each over nucleotide positions 256 to 280. Legible row labels include B, D, E, L1, L2, F, G, C, A, H, I, K and L3. [Per-position patterns not recoverable from the scanned sheet.]



